Graphs

PATRICIA

Practical Algorithm to Retrieve Information Coded in Alphanumeric

Prefix Operations

1. all keys, sorted
2. read by prefix
3. update by prefix

How IntMap Works

Retrieval

`lookup “b”`

b = 1

`lookup “bab”`

b

`lookup “bab”`

b → a

`lookup “bab”`

b → a → b = 7

Prefixes

every key starting with “ba”

Tries

retrieve key character by character

Paths

two keys: 00011, 00001

Path Compression

two keys: 00011, 00001

Path Compression

fully compressed radix tree

Data.IntMap

binary trie with path compression

Data.IntMap

• `Leaf` (path, key, value)
• `Branch` (path, children)
• `Empty`

Data.IntMap

``````type Prefix = Int
type Mask = Int

data IntMap a =
Branch !Prefix
!(IntMap a) !(IntMap a)
| Leaf !Prefix a
| Empty
``````

lookup

``````lookup _ Empty = Nothing
lookup k (Leaf k' v)
| k == k'   = Just v
| otherwise = Nothing
``````

lookup

``````lookup k (Branch prefix control l r)
| getPrefix k control /= prefix = Nothing
| k .&. control == 0            = lookup k l
| otherwise                     = lookup k r
``````
``````getPrefix k control =
key .&. complement ((control `shiftL` 1) - 1)
``````

Beyond IntMap

Binary Trie

• binary: “bit-by-bit”
• binary: “two children”

Span

n bits = 2ⁿ children

Space vs Height

from ART paper

256 children (byte at a time)

Nodes

``````data ART a = Empty
| Leaf !Key a
| Node !Mask !Prefix !(Children a)
``````

Nodes

``````type Chunk    = Word8
type Chunks   = UArray Chunk Chunk
type Values a = Array Chunk a
type Size     = Word8

data Children a =
N4   !Size !Chunks !Values
| N16  !Size !Chunks !Values
| N48  !Size !Chunks !Values
| N256 !Size !(Array Chunk (Maybe a))
``````

• Java version
• good fit for Haskell?

Recap

Tries

• sorted keys
• prefix operations
• merging

Data.IntMap

• binary trie
• path compression

Beyond IntMap

• different spans