Commit b1d4f47

improve and run benchmark again for real world testing (#42)
* improve and run benchmark again for real world testing
* add perf note to explain benchmark results
* tweaks
1 parent 8be760a commit b1d4f47

3 files changed: +558 -125 lines changed

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -117,4 +117,4 @@ jobs:
 run: isort --check-only --diff kinesis tests
 
 - name: Lint with flake8
-run: flake8 kinesis tests --max-line-length=88 --extend-ignore=E203,W503
+run: flake8 kinesis tests --max-line-length=88 --extend-ignore=E203,W503,E501

README.md

Lines changed: 51 additions & 3 deletions
@@ -1,6 +1,6 @@
 # async-kinesis
 
-[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black) [![PyPI version](https://badge.fury.io/py/async-kinesis.svg)](https://badge.fury.io/py/async-kinesis) [![Python 3.7](https://img.shields.io/badge/python-3.7-blue.svg)](https://www.python.org/downloads/release/python-370/) [![Python 3.8](https://img.shields.io/badge/python-3.8-blue.svg)](https://www.python.org/downloads/release/python-380/)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black) [![PyPI version](https://badge.fury.io/py/async-kinesis.svg)](https://badge.fury.io/py/async-kinesis) [![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/release/python-3100/) [![Python 3.11](https://img.shields.io/badge/python-3.11-blue.svg)](https://www.python.org/downloads/release/python-3110/) [![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/release/python-3120/)
 
 ```
 pip install async-kinesis
@@ -169,9 +169,57 @@ Note:
 
 See [benchmark.py](./benchmark.py) for code
 
-50k items of approx 1k (python) in size, using single shard.
+The benchmark tool allows you to test the performance of different processors with AWS Kinesis streams.
 
-![Benchmark](docs/benchmark.png)
+### Usage
+
+```bash
+# Run with default settings (50k records)
+python benchmark.py
+
+# Dry run without creating AWS resources
+python benchmark.py --dry-run
+
+# Custom parameters
+python benchmark.py --records 10000 --shards 2 --processors json msgpack
+
+# Generate markdown output
+python benchmark.py --markdown
+
+# Run multiple iterations
+python benchmark.py --iterations 3
+```
+
+### Example Results
+
+50k items of approx 1k (python) in size, using single shard:
+
+| Processor | Iteration | Python Bytes | Kinesis Bytes | Time (s) | Records/s | Python MB/s | Kinesis MB/s |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| StringProcessor | 1 | 2.7 MB | 50.0 MB | 51.2 | 977 | 53.9 kB | 1000.0 kB |
+| JsonProcessor | 1 | 2.7 MB | 50.0 MB | 52.1 | 960 | 52.9 kB | 982.5 kB |
+| JsonLineProcessor | 1 | 2.7 MB | 41.5 MB | 43.5 | 1149 | 63.4 kB | 976.7 kB |
+| JsonListProcessor | 1 | 2.7 MB | 2.6 MB | 2.8 | 17857 | 982.1 kB | 946.4 kB |
+| MsgpackProcessor | 1 | 2.7 MB | 33.9 MB | 35.8 | 1397 | 77.0 kB | 969.8 kB |
+
+*Note: MsgpackProcessor performance varies significantly with record size. While it appears slower here with small records due to netstring framing overhead (~9%) and msgpack serialization costs, it can be faster than JSON processors with larger, complex data structures where msgpack's binary efficiency provides benefits.*
+
+### Processor Recommendations
+
+Choose the optimal processor based on your use case:
+
+| Use Case | Recommended Processor | Reason |
+| --- | --- | --- |
+| **High-frequency small messages** (<500 bytes) | JsonLineProcessor | Minimal aggregation overhead, simple parsing |
+| **Individual JSON messages** | JsonProcessor | Maximum compatibility, no aggregation complexity |
+| **Batch processing arrays** | JsonListProcessor | Highest throughput for compatible consumers |
+| **Large complex data** (>1KB) | MsgpackProcessor | Binary efficiency outweighs overhead |
+| **Raw text/logs** | StringProcessor | Minimal processing overhead |
+| **Binary data or deeply nested objects** | MsgpackProcessor | Compact binary representation |
+| **Real-time streaming** | JsonLineProcessor | Simple, fast parsing with good compression |
+| **Bandwidth-constrained environments** | MsgpackProcessor | Smaller payload sizes |
+
+**Performance Testing:** Use the benchmark tool with different `--record-size-kb` and `--processors` options to determine the best processor for your specific data patterns.
 
 
 ## Unit Testing
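
Review note on the Example Results table added to README.md in this commit: the Records/s column looks consistent with dividing the record count by the Time (s) column. The sketch below checks that arithmetic, assuming each run sent exactly 50,000 records (the README only says "50k"). The names `RECORDS`, `RESULTS`, `records_per_second`, and `check_table` are hypothetical and are not taken from `benchmark.py`.

```python
# Assumption: every run sent exactly 50,000 records (README states "50k items").
RECORDS = 50_000

# (processor, elapsed seconds, Records/s as reported in the README table)
RESULTS = [
    ("StringProcessor", 51.2, 977),
    ("JsonProcessor", 52.1, 960),
    ("JsonLineProcessor", 43.5, 1149),
    ("JsonListProcessor", 2.8, 17857),
    ("MsgpackProcessor", 35.8, 1397),
]


def records_per_second(records: int, seconds: float) -> int:
    """Return throughput rounded to the nearest whole record per second."""
    return round(records / seconds)


def check_table(results=RESULTS, records=RECORDS):
    """Return (processor, computed Records/s, reported Records/s) triples."""
    return [
        (name, records_per_second(records, seconds), reported)
        for name, seconds, reported in results
    ]


if __name__ == "__main__":
    for name, computed, reported in check_table():
        print(f"{name:<18} computed={computed:>6} reported={reported:>6}")
```

The same division can be applied to numbers from a local run to compare processors on a given workload; it does not reproduce the MB/s columns, which depend on byte counts that the table reports only in rounded form.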
