Commit 59254f5

committed
completed readme for changes
1 parent 5f8561a commit 59254f5

1 file changed

Lines changed: 42 additions & 48 deletions

File tree

README.md

@@ -25,21 +25,20 @@ Kotlin.**
2525

2626
## Overview
2727

28-
KEncode has **three focused entry points**, all aimed at compact, ASCII-safe
28+
KEncode has three focused entry points, all aimed at compact, ASCII-safe
2929
representations:
3030

31-
1. **ByteEncoding codecs**: `Base62` / `Base36` / `Base64` / `Base85`
31+
1. ByteEncoding codecs: `Base62` / `Base36` / `Base64` / `Base85`
3232
Low-level encoders/decoders for byte arrays when you already have binary
3333
data.
3434

35-
2. **Standalone BinaryFormat**: `PackedFormat`
36-
A binary serializer for flat Kotlin serializable classes. It uses a bitset
37-
for booleans and nullability, and varint encodings for integers. It avoids
38-
support for nesting or collections in order to keep the layout small and
39-
predictable. Use `kotlinx.serialization.ProtoBuf` when hierarchical
40-
structures are required.
35+
2. Standalone BinaryFormat: `PackedFormat`
36+
A compact binary serializer optimized for Kotlin. It supports arbitrary
37+
object graphs, including nested objects, lists, and maps. It uses
38+
per-object bitsets for booleans and nullability, and varint encodings for
39+
integers to keep the layout significantly smaller than standard formats.
4140

42-
3. **Standalone StringFormat**: `EncodedFormat`
41+
3. Standalone StringFormat: `EncodedFormat`
4342
A wrapper that applies a binary format, optionally appends a checksum, and
4443
then encodes the final byte sequence using a chosen `ByteEncoding`. This
4544
produces short, deterministic string representations suitable for external
@@ -104,23 +103,18 @@ val decoded = EncodedFormat.decodeFromString<Payload>(encoded)
104103

105104
You can use the encoders standalone on raw byte arrays.
106105

107-
```kotlin
108-
val bytes = "any byte data".encodeToByteArray()
109-
println(Base36.encode(bytes)) // 0ksef5o4kvegb70nre15t
110-
println(Base62.encode(bytes)) // 2BVj6VHhfNlsGmoMQF
111-
println(Base64.encode(bytes)) // YW55IGJ5dGUgZGF0YQ==
112-
println(Base85.encode(bytes)) // @;^?5@X3',+Cno&@/
113-
114-
// Decoding is symmetric:
115-
val back = Base62.decode("2BVj6VHhfNlsGmoMQF")
116-
```
106+
### 2. Update the **ProtoBuf serialization** section
107+
108+
The motivation is now changed from "complexity" to "interoperability."
117109

118110
---
119111

120112
## ProtoBuf serialization
121113

122-
For more complex payloads (nested types, lists, maps) use `ProtoBuf` as the
123-
binary format and still get compact, ASCII-safe strings:
114+
`PackedFormat` is optimized for Kotlin-to-Kotlin scenarios. If you need
115+
cross-language compatibility (consuming the payload in non-JVM languages) or
116+
require standard Protocol Buffer schema evolution, you can swap the binary
117+
format for `ProtoBuf`:
124118

125119
```kotlin
126120
@Serializable
@@ -207,44 +201,44 @@ val decoded = format.decodeFromString<Command>(encoded)
207201

208202
## PackedFormat explanation
209203

210-
`PackedFormat` is a `BinaryFormat` designed to produce very small payloads for
211-
**flat** Kotlin serializable classes. It avoids nesting and collections,
212-
allowing a compact and deterministic binary layout.
204+
`PackedFormat` is a `BinaryFormat` designed to produce the smallest feasible
205+
payloads for Kotlin classes. Unlike JSON or standard ProtoBuf, it uses a
206+
state-aware bit-packing strategy to merge boolean flags and nullability
207+
indicators.
213208

214-
### Field layout
209+
### Capabilities
215210

216-
For a single class, the encoding consists of:
211+
* Full Graph Support: Handles nested objects, lists, maps, and polymorphic
212+
types.
213+
* Bit-Packing: For every class in the hierarchy, all `Boolean` fields and
214+
`Nullable` indicators are packed into a single varlong header. For very or
215+
dynamic payloads a
216+
* VarInts: Integers can be encoded as varints via annotation (`@VarUInt` for unsigned values, `@VarInt` for zig-zag).
217217

218-
1. **Flags varlong**
219-
220-
A single varlong encodes:
221-
* **Boolean bits** — one per boolean property, in declaration order
222-
(`1` = true, `0` = false)
223-
* **Nullability bits** — one per nullable property
224-
(`1` = null, `0` = non-null)
225-
226-
2. **Payload bytes**
218+
### Field layout
227219

228-
After the flags, each non-boolean field is encoded in declaration order:
220+
For a standard class, the encoding follows this structure:
229221

230-
* Fixed primitives (`Byte`, `Short`, `Int`, `Long`, `Float`, `Double`)
231-
* `String`: `[varint length][UTF-8 bytes]`
232-
* `Char`: UTF-8 encoding
233-
* `Enum`: ordinal as varint
234-
* Nullable fields omit payload bytes when null
222+
1. Bitmask Header: a single varlong containing:
223+
* Boolean bits — one per boolean property in the specific class.
224+
* Nullability bits — one per nullable property.
225+
*(A class with 10 booleans and 5 nullable fields therefore carries 15 flag bits, which is at most 3 bytes of overhead, assuming the usual 7 payload bits per varint byte).*
235226

236-
A top-level nullable value is encoded with a single bit: `1` = null, `0` =
237-
present.
227+
2. Payload bytes: after the flags, fields are encoded in declaration order:
228+
* Primitives: Encoded densely (VarInt for Int/Long, fixed for others).
229+
* Strings: `[varint length][UTF-8 bytes]`.
230+
* Nested Objects: Recursively encodes the child object (starting with its own Bitmask Header).
231+
* Collections: `[varint size][item 1][item 2]...` (Collections do not use bitmasks; nulls in lists use inline markers).
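
To make the header cost concrete, here is a minimal sketch of packing flags into a varlong. It is not KEncode's implementation: `packFlags` and `encodeVarLong` are hypothetical helpers, and both the bit order (first flag in the lowest bit) and the LEB128-style varint (7 payload bits per byte) are assumptions.

```kotlin
// Hypothetical helper: pack up to 64 flags into a Long, first flag in bit 0.
fun packFlags(flags: List<Boolean>): Long {
    require(flags.size <= 64) { "a single Long holds at most 64 flags" }
    var bits = 0L
    flags.forEachIndexed { i, set -> if (set) bits = bits or (1L shl i) }
    return bits
}

// Hypothetical helper: LEB128-style unsigned varint, 7 payload bits per byte.
fun encodeVarLong(value: Long): ByteArray {
    val out = ArrayList<Byte>()
    var v = value
    // While bits remain above the low 7, emit a byte with the continuation bit set.
    while ((v and 0x7FL.inv()) != 0L) {
        out.add(((v and 0x7FL) or 0x80L).toByte())
        v = v ushr 7
    }
    out.add(v.toByte())
    return out.toByteArray()
}

fun main() {
    // Worst case for 10 booleans + 5 nullability bits: every flag is set.
    val header = encodeVarLong(packFlags(List(15) { true }))
    println(header.size) // 3
}
```

Under these assumptions the header shrinks when the high flags are clear: 14 set flags fit in 2 bytes, and a header with only the low 7 flags in use fits in 1 byte.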
238232

239233
### VarInt / VarUInt annotations
240234

241-
Use varint-style encodings for compact integer fields:
235+
You can further optimize integer fields using annotations:
242236

243237
```kotlin
244238
@Serializable
245239
data class Counters(
246240
@VarUInt val seq: Long, // unsigned varint
247-
@VarInt val delta: Int // zig-zag + varint
241+
@VarInt val delta: Int // zig-zag + varint (good for small negative numbers)
248242
)
249243
```
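
The effect of zig-zag encoding can be sketched in a few lines. This is the standard zig-zag mapping used by Protocol Buffers, shown for illustration; `zigZag`, `unZigZag`, and `varIntSize` are hypothetical names, and whether KEncode uses exactly this mapping internally is an assumption.

```kotlin
// Map signed ints to unsigned so that small magnitudes stay small:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
fun zigZag(n: Int): Int = (n shl 1) xor (n shr 31)

// Inverse mapping.
fun unZigZag(z: Int): Int = (z ushr 1) xor -(z and 1)

// Number of bytes an unsigned varint needs (7 payload bits per byte).
fun varIntSize(value: Int): Int {
    var v = value
    var size = 1
    while ((v ushr 7) != 0) {
        v = v ushr 7
        size++
    }
    return size
}

fun main() {
    println(varIntSize(-1))         // 5: -1 as a raw 32-bit pattern fills every byte
    println(varIntSize(zigZag(-1))) // 1: zig-zag maps -1 to 1
    println(unZigZag(zigZag(-64)))  // -64
}
```

Without zig-zag, any negative `Int` written as an unsigned varint occupies the full 5 bytes, which is why the signed annotation suits small negative deltas.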
250244

@@ -255,15 +249,15 @@ data class Counters(
255249
`EncodedFormat` provides a single `StringFormat` API that produces short,
256250
ASCII-safe tokens by composing three layers:
257251

258-
1. **Binary format**
252+
1. Binary format
259253
Default is `PackedFormat`, but any `BinaryFormat` (e.g. ProtoBuf) can be
260254
used.
261255

262-
2. **Checksum (optional)**
256+
2. Checksum (optional)
263257
Supports `Crc16`, `Crc32`, or a custom implementation.
264258
The checksum is appended to the binary payload and verified during decode.
265259

266-
3. **Text codec**
260+
3. Text codec
267261
Converts the final bytes into a compact ASCII representation.
268262
Default is `Base62`, with alternatives such as `Base36`, `Base64`,
269263
`Base64UrlSafe`, `Base85`, or custom alphabets.
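
The layering can be illustrated with JVM standard-library pieces standing in for the last two layers. This is a sketch, not `EncodedFormat` itself: `encodeToken`, `decodeToken`, and `crc32Bytes` are hypothetical names, the checksum layout (4 bytes, big-endian, appended) is an assumption, and URL-safe Base64 is used here in place of the default `Base62`.

```kotlin
import java.util.Base64
import java.util.zip.CRC32

// CRC-32 of the payload as 4 big-endian bytes (layout is an assumption).
fun crc32Bytes(data: ByteArray): ByteArray {
    val crc = CRC32().apply { update(data) }.value
    return ByteArray(4) { i -> (crc ushr (24 - 8 * i)).toByte() }
}

// Layers 2 and 3: append the checksum, then encode the bytes as ASCII text.
fun encodeToken(payload: ByteArray): String =
    Base64.getUrlEncoder().withoutPadding()
        .encodeToString(payload + crc32Bytes(payload))

// Reverse the layers and verify the checksum before returning the payload.
fun decodeToken(token: String): ByteArray {
    val bytes = Base64.getUrlDecoder().decode(token)
    require(bytes.size >= 4) { "token too short" }
    val payload = bytes.copyOfRange(0, bytes.size - 4)
    val checksum = bytes.copyOfRange(bytes.size - 4, bytes.size)
    require(checksum.contentEquals(crc32Bytes(payload))) { "checksum mismatch" }
    return payload
}

fun main() {
    // The payload would normally come from layer 1 (the binary format).
    val payload = "any byte data".encodeToByteArray()
    val token = encodeToken(payload)
    println(decodeToken(token).decodeToString()) // any byte data
}
```

A corrupted token fails the checksum comparison in `decodeToken` instead of yielding a silently wrong payload, which is the purpose of the optional checksum layer.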
