Release v0.4.5
Highlights
The SGLang team is excited to announce the release of v0.4.5! This version introduces several significant features, including Llama 4 support, a FlashAttention 3 backend, EAGLE3 speculative decoding, DeepEP integration, and disaggregated prefill and decoding.
New Features
- Llama 4 Support: We added support for the Llama 4 models, with accuracy matching official benchmark numbers: a zero-shot score of 75.2 on the MMLU Pro dataset for `Llama-4-Scout-17B-16E-Instruct` and 80.7 for `Llama-4-Maverick-17B-128E-Instruct`. #5092
- FlashAttention 3 Backend: Our implementation of the FlashAttention 3 backend delivers significant acceleration for long-context tasks (a minimal usage sketch follows this list). #4709
- EAGLE3 Speculative Decoding: We're proud to be the first to support EAGLE3 speculative decoding, offering substantial gains in decoding throughput. Learn more in our documentation and the EAGLE3 paper. #4247
- DeepEP Integration: By incorporating DeepEP, we enhanced performance for MoE inference.
- Disaggregated Prefill and Decoding: We introduced a prototype for disaggregated prefill and decoding, with plans for further optimizations.
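To make the new features concrete, here is a minimal, untested sketch of launching the offline engine with the FlashAttention 3 backend and EAGLE3 speculative decoding enabled. It assumes the `sgl.Engine` constructor accepts the corresponding server flags as keyword arguments (`attention_backend` mirrors the `--attention-backend fa3` flag from #4680, and the `speculative_*` keywords mirror the EAGLE3 server flags); the model and draft-model paths are illustrative placeholders, not tested configurations.

```python
# Minimal sketch (not from the release notes): offline Engine with the new
# FlashAttention 3 backend and EAGLE3 speculative decoding. All model paths
# below are illustrative placeholders; the keyword arguments are assumed to
# mirror the corresponding --attention-backend / --speculative-* server flags.
import sglang as sgl

llm = sgl.Engine(
    model_path="meta-llama/Llama-3.1-8B-Instruct",        # placeholder target model
    attention_backend="fa3",                              # FlashAttention 3 backend (#4709)
    speculative_algorithm="EAGLE3",                       # EAGLE3 speculative decoding (#4247)
    speculative_draft_model_path="path/to/eagle3-draft",  # placeholder draft model
    speculative_num_steps=5,
    speculative_eagle_topk=8,
    speculative_num_draft_tokens=32,
)

outputs = llm.generate(
    ["The capital of France is"],
    {"temperature": 0, "max_new_tokens": 32},
)
print(outputs[0]["text"])
llm.shutdown()
```

The equivalent command-line flags should also work with `python -m sglang.launch_server` when serving over HTTP.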
Thanks very much to the NVIDIA team, LinkedIn team, EAGLE team, Oracle team, Meituan team, and our incredible open-source community for their invaluable contributions!
Coming Soon
- Disaggregated Prefill and Decoding: #4655
- Llama 4 Optimization: #5118
- EP Enhancement: #4734
- FA3 Enhancement: #4709
We’re thrilled about these advancements and eager to hear your feedback! Join us on our Slack channel at slack.sglang.ai to connect and share your thoughts. Cheers!
What's Changed
- Fix a regression introduced by overlapping KV cache writing by @merrymercy in #4375
- Update ci_install_dependency.sh to use accelerate 1.4.0 by @merrymercy in #4392
- Improve DP attention by @merrymercy in #4390
- Fix auto merge & add back get_flat_data_by_layer by @merrymercy in #4393
- Add some fused elementwise kernels for grok-1 by @merrymercy in #4398
- Fix Llama3.3 tool call support by @CatherineSue in #4320
- Fix the output of hidden states after HTTP requests by @Qiaolin-Yu in #4269
- Add a dummy grok test case by @merrymercy in #4399
- Hot fix for hicache with new page aligned radixtree by @xiezhq-hermann in #4397
- bump v0.4.4.post1 by @zhyncs in #4402
- Update CODEOWNERS by @merrymercy in #4403
- Hierarchical Caching supports MLA by @zeroorhero in #4009
- cleanup deps 1/n by @zhyncs in #4400
- feat(remote_model): support variable remote backend for model loader by @DellCurry in #3964
- [bug] fix duplicate variable MAX_PIXELS in qwen_vl.py by @qibaoyuan in #4419
- [Doc] fix wrong flag in deepseek documentation by @lausannel in #4427
- Add moe topk softmax templated from vllm by @qingquansong in #4302
- bump v0.0.5.post1 by @zhyncs in #4437
- Fix maximum recursion depth triggered on exception exit by @merrymercy in #4438
- use topk_softmax with sgl-kernel by @zhyncs in #4439
- docs: hot fix torch compile cache by @zhaochenyang20 in #4442
- ci: update transformers==4.48.3 by @mickqian in #4451
- Fix test_create_kvindices unit test by @sleepcoo in #4452
- [Fix] Fix errors when using the device except cuda. by @cboss6 in #4455
- docs: Add Llama 3.3 to supported models by @JiangJiaWei1103 in #4453
- Update bench_serving.py by @xu-song in #4454
- bugfix: Update sampling_params.py by @WrRan in #4413
- typos: Update sampling_params.md by @WrRan in #4391
- Auto-detect device if not specified in server arguments. by @vshekhawat-hlab in #4423
- Add support for upcoming QwenMoe by @michaelfeil in #4447
- perf: update fused moe config by @mickqian in #4459
- typos by @WrRan in #4368
- Fix minor style by @merrymercy in #4460
- cleanup deps 2/n by @zhyncs in #4464
- feat: Add FlashMLA submodule by @shuaills in #4449
- [Fix] use `torch.cat` instead of `torch.concat` to prevent entering the `Autograd` backends. by @Alcanderian in #4466
- Fix finish step for pr tests and notebook tests by @merrymercy in #4467
- Remove filter for pr-tests by @merrymercy in #4468
- Add greedy verification kernel by @Ying1123 in #4383
- Release sgl-kernel v0.0.5.post2 by @merrymercy in #4469
- Revert "feat: Add FlashMLA submodule (#4449)" by @zhyncs in #4470
- [Eagle] Remove the greedy branch and some redundant code by @Ying1123 in #4363
- Support FlashMLA backend by @sleepcoo in #4472
- fix custom allreduce performance/accuracy problem by @yizhang2077 in #4477
- 400 on empty input_ids by @yinghai in #4481
- Update CODEOWNERS by @merrymercy in #4484
- Statistical Analysis of the Output Stability of the Deepseek Model by @tanzelin430 in #4202
- model: support gemma-3-it by @mickqian in #4424
- Initialize image processor for skip-tokenizer-init codepath by @yinghai in #4479
- Fix: modelscope env comment by @huiwq1990 in #4474
- Fix: Complete int32 to int64 conversion by @xiezhq-hermann in #4465
- [ROCm] enable moe topk softmax in amd by @yiakwy-xpu-ml-framework-team in #4448
- Feat/support code completion by @woodx9 in #3612
- Add endpoint for file support, purely to speed up processing of input_embeds. by @RinRin-32 in #2797
- Set xgrammar as the default grammar backend by @minleminzui in #4386
- Fix router test by @ByronHsu in #4483
- [Fix] use `torch.inference_mode()` instead of `torch.no_grad()` by @Alcanderian in #4372
- [Feature] Support Deepseek-VL2 by @ccw1996 in #2798
- config: Update fused moe config by @mickqian in #4493
- Support serving DeepSeek-R1-Channel-INT8 with 32 L40S. by @solrex in #4418
- Support Online Quantization for W8A8 by @hebiao064 in #4485
- Tool call with text by @xihuai18 in #4067
- Nicer standalone engine interface by @yinghai in #4480
- [Fix] Resolve GPU Memory Leak in update_weights_from_tensor by @U-rara in #4446
- [Doc] add doc for quantization w8a8_fp8 or w8a8_int8 by @HandH1998 in #4495
- Fix data parallel + tensor parallel by @merrymercy in #4499
- [ROCm] fix dtype by @yiakwy-xpu-ml-framework-team in #4510
- Remove redundant type conversion by @merrymercy in #4513
- Update readme by @merrymercy in #4517
- [sgl-router] improvement to avoid hang by @yinghai in #4482
- Revert "feat: update grouped_topk to support softmax and sigmoid" by @ispobock in #4505
- bump v0.0.5.post3 by @zhyncs in #4520
- upgrade sgl-kernel 0.0.5.post3 by @zhyncs in #4522
- sglang quant module remove vllm dependency by @BBuf in #4507
- Unit test for Hierarchical Caching by @xiezhq-hermann in #4486
- refactor: rewrite bench-mmmu-sglang by @mickqian in #4458
- fix: second_per_grid_ts should be used to get mrope position by @mickqian in #3682
- [Hotfix] solve fp8 w8a8 ci test fail by @BBuf in #4531
- remove useless backend forward in rotary_embedding by @BBuf in #4500
- Fix the incorrect args in benchmark_and_profiling.md by @tianyuzhou95 in #4542
- cleanup deps 3/n by @zhyncs in #4541
- Add deepseek v2 torch compile pr test by @ispobock in #4538
- use sgl custom all reduce by @zhyncs in #4441
- [Fix] Type annotation correction for UpdateWeightsFromTensorReqInput by @U-rara in #4532
- [Feature] Support EAGLE 3 by @chromecast56 in #4247
- Reduce computation and communication in DP attention by @ch-wan in #4521
- [Feature] Support Tensor Parallelism and Weight Slicing for Lora by @aoshen524 in #4274
- Optimize Triton decoding kernel for dynamic workload by @Alcanderian in #4553
- [Fix] Fix raw_bs bug when using flashinfer mla and eagle by @Fridge003 in #4557
- Create col-major and tma-aligned x_scale for deep_gemm.gemm_fp8_fp8_bf16_nt by @strgrb in #4515
- [Feature] Integrate DeepEP into SGLang by @liz-badada in #4232
- Support FlashMLA backend cuda graph by @sleepcoo in #4514
- Add clang-format to pre-commit config by @Hongbosherlock in #4583
- [fix] fix initialization of _ENABLE_TORCH_INFERENCE_MODE by @Alcanderian in #4549
- avoid cudaStreamSynchronize in DeepSeekV2AttentionMLA by @strgrb in #4577
- Support `n` in OpenAI API completions by @ChuyueSun in #3446
- [fix] fix illegal mem access and clean up triton attention backend by @Alcanderian in #4571
- Enable setting sglang logger from Env Variable `SGLANG_LOGGING_CONFIG_PATH` by @guoyuhong in #4592
- Update doc for MTP and DP attention by @ispobock in #4622
- Support fp8 gemm for blackwell by @wenscarl in #4558
- fix SUPPORT_CUTLASS_BLOCK_FP8 flag by @ch-wan in #4640
- Set deepgemm to the default value in the hopper architecture. by @sleepcoo in #4613
- [docs] Add links and fix grammars in deploy_on_k8s.md by @windsonsea in #4641
- Align completion and chat_completion response to OpenAI API by @guoyuhong in #4637
- [PD] Release initial code by @ByronHsu in #4654
- fix: fix ipython running error for Engine due to outlines nest_asyncio by @minleminzui in #4582
- update news for README by @zhyncs in #4664
- Speed up per token and per tensor quant by 15% by @zcnrex in #4639
- [quantization] fix channelwise conversion with scalar weight scale by @yundai424 in #4596
- Correcting default configuration when benchmarking fused_moe by @penguin-wwy in #4665
- [1/3] fix dsv3 awq issue by @AniZpZ in #4556
- [Docs] Update docs for gemma3 and VLM chat templates by @adarshxs in #4674
- [CI fix] test skipping modelopt on AMD by @adarshxs in #4677
- fix flaky ut by @zhyncs in #4670
- Add EAGLE mtbench benchmark script by @ispobock in #4676
- Bug fix for metrics counter by @xiezhq-hermann in #4660
- [Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 by @adarshxs in #3984
- Optimize Permute Kernel in DeepEP by @xutizhou in #4643
- fix typo SGLang supports three grammar backends by @BroadbentJim in #4679
- close gemma2 in test_verl_engine.py temporarily by @yizhang2077 in #4685
- Multiple tiny code cleanups by @fzyzcjy in #4608
- Support async in DeepEP by @fzyzcjy in #4610
- refactor: bug fixes and refactor for vlm by @mickqian in #4661
- Move mem_state update into debug mode by @xiezhq-hermann in #4525
- Fix RotaryEmbedding when using Triton backend for EXAONE-3.5-2.4B by @lkm2835 in #4064
- Unify variable naming: replace is_in_free_group with is_not_in_free_group by @c1lovez1 in #4698
- [ROCm] Enable MTP (NextN) on AMD GPU by @alexsun07 in #4631
- Support FA3 as Attention backend by using `--attention-backend fa3` by @hebiao064 in #4680
- rename benchmark_deepgemm_fp8_group_gemm.py by @tbzhang in #4605
- [Quant Kernel] refactored per token group quant fp8 to support int8 up-to 2x faster by @zcnrex in #4396
- Support dynamic version name in sglang's pyproject.toml by @guoyuhong in #4720
- update pyproject by @zhyncs in #4731
- [PD] Remove invalid parameter by @XucSh in #4721
- Fix EAGLE3 for llama3.3 70b by @ispobock in #4716
- Fix circular imports in gptq.py and unblock test explorer by @hebiao064 in #4736
- [Model] Support Qwen2ForSequenceClassification by @Ximingwang-09 in #4609
- Support FP4 gemm (1/2) by @trevor-m in #3899
- Add DeepEP tests into CI by @fzyzcjy in #4737
- model: Minicpmo by @mickqian in #3023
- support cu128 sgl-kernel by @zhyncs in #4744
- [Benchmark] tilelang vs deepgemm vs w8a8_block_fp8_matmul by @zcnrex in #4735
- Super tiny fix typo by @fzyzcjy in #4738
- fix FlashMLA cudagraph config by @sleepcoo in #4691
- Speedup warmup when DP > 1 by @fzyzcjy in #4695
- Add endpoints to dump selected expert ids by @yuhsuan-t in #4435
- add dsv3 int8 test by @HandH1998 in #4705
- [Feature] Support "strict" in function calling by @DarkSharpness in #4310
- Revert "Add DeepEP tests into CI (#4737)" by @fzyzcjy in #4751
- Fix test_expert_distribution failure by @fzyzcjy in #4752
- Fix warmup error when dp=1 by @fzyzcjy in #4753
- Add retry for flaky tests in CI by @fzyzcjy in #4755
- [Fix] Fix unexpected idx bug of Phi-3-small by @Fridge003 in #4728
- Warn users when release_memory_occupation is called without memory saver enabled by @fzyzcjy in #4566
- fix(typo): fix `reply` to `replay` in `base_attn_backend.py` by @Thysrael in #4784
- Support recording experts workload in QWen2-MoE by @ch-wan in #4775
- Fix popen_launch_server wait for 20 minutes when child process exits by @fzyzcjy in #4777
- Use metadata to detect version of package by @kebe7jun in #4782
- Fix shared memory OOM on sm86 GPUs. by @Conless in #4797
- Support compressed tensors fp8w8a8 by @BBuf in #4743
- bump v0.4.4.post2 by @zhyncs in #4669
- [3/3] fix dsv3 awq issue by @laixinn in #4719
- Update supported_models.md: adding open-r1 Olympic Code 32B by HuggingFace by @didier-durand in #4628
- Align finish reason and stream mode in openai api by @xihuai18 in #4388
- support clip embedding model by @Titan-p in #4506
- update xgrammar 0.1.17 by @zhyncs in #4804
- Patch PyTorch's bug that cross-process tensor transfer will lead to wrong device by @fzyzcjy in #4565
- [FA3 Attn Backend] Remove Unnecessary Device Sync for FA3 by @hebiao064 in #4745
- support cmake for sgl-kernel by @zhyncs in #4706
- Use apply_rope_with_cos_sin_cache_inplace for DeepSeek by @strgrb in #4764
- Fix ut mla-test-1-gpu-amd by @strgrb in #4813
- Remove Unintended Capture Batch Sizes in AMD HIP Graph Runner by @gmlwns2000 in #4638
- [k8s] Clarified the usage of shared memory. by @jsuchome in #4341
- gemma3: impl `get_attention_sliding_window_size` for attn init by @vhain in #4823
- add partial_json_parser and einops by @zhyncs in #4827
- fix the release doc dependency issue by @zhyncs in #4828
- Update doc for DeepSeek-V3-0324 by @ispobock in #4825
- deps: lazy import optional dependencies `gguf` and `torchvision` by @vhain in #4826
- Update MMMU Benchmark instructions by @ravi03071991 in #4694
- Fix the nightly eval by lowering the threshold of `neuralmagic/gemma-2-2b-it-FP8` by @merrymercy in #4830
- Basic Cleanup by @danielholanda in #4833
- Support (1 <= dp < tp) in the dp attention in DeepEP by @tarinkk in #4770
- [Fix] Add compressed_tensors as deps by @ocss884 in #4819
- Fix error due to CustomAllreduce setup failure by @kebe7jun in #4815
- use default for torch.ops by @zhyncs in #4835
- [CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder by @b8zhong in #3969
- [Misc] Fix issues reported by torchfix by @b8zhong in #4837
- Include context length in /v1/models response. by @jondurbin in #4809
- [Fix] `self.worker` assignment in `TpModelWorker` and refactor references by @JustinTong0323 in #4788
- Fix the lora adapter when lora path is none by @Qiaolin-Yu in #4799
- fix: fix typo of comments in w8a8_fp8.py by @ZhuJiaqi9905 in #4843
- Remove retry in nightly tests by @fzyzcjy in #4846
- Fix CI of test_patch_torch by @fzyzcjy in #4844
- IPv6 support by @vincent-4 in #3949
- ci: add condition for daily docker build by @warjiang in #4487
- [Fix] fix output_top_logprobs is not exist by @lambert0312 in #4597
- fix: when using the SGLANG_PORT env variable, port is a str by @lengrongfu in #4528
- Support Page Size > 1 for FA3 by @hebiao064 in #4832
- Fix Engine error when enabling DP attention by @fzyzcjy in #4648
- fix: Inappropriate lack of Optional type on OpenAI ChatCompletionRequest by @BroadbentJim in #4681
- Support controlling nsys start and end range programmatically by @fzyzcjy in #4688
- Remove empty tool function name by @kebe7jun in #4704
- Fix missing arguments in SchedulePolicy and RadixCache initialization in tests. by @vshekhawat-hlab in #4712
- get the python version from env by @DavidChan0519 in #4729
- Fix torch.cuda.MemPool() internal assertion failure by @fzyzcjy in #4687
- Super tiny remove unused code by @fzyzcjy in #4750
- Support with_stack and record_shapes in profiler by @fzyzcjy in #4740
- test: reduce `mem_fraction_static` for gemma3 vision test by @vhain in #4840
- Fix CI tests by @merrymercy in #4853
- Fix fa3 cuda graph page_size > 1 precision and page_size=1 speed by @qingquansong in #4855
- Revert "get the python version from env (#4729)" by @zhyncs in #4863
- [Feature] add multi-rank support for Lora by @jcbjcbjc in #4492
- Clean up `import vllm` in quantization/`__init__.py` by @merrymercy in #4834
- Fix wrong variable name when stopping memory profile by @Fr4nk1inCs in #4772
- [Feat] support deepgemm for cmake by @yinfan98 in #4864
- Make torch compile configurable for biased_grouped_topk by @qingquansong in #4749
- update sgl-kernel test ci by @zhyncs in #4866
- fix sampling issue by @zhyncs in #4871
- bump sgl-kernel 0.0.5.post4 by @zhyncs in #4768
- fix sgl-kernel cu118 build by @zhyncs in #4872
- [Feature] Support FA3 backend for MLA by @Fridge003 in #4831
- upgrade sgl-kernel 0.0.5.post4 by @zhyncs in #4873
- update torch compile doc by @ispobock in #4874
- bump v0.4.4.post3 by @zhyncs in #4878
- Fix BadRequestError wrong arguments and remove openai dependency by @fzyzcjy in #4882
- Improve stack trace of retry errors by @fzyzcjy in #4845
- Tiny fix doc error by @fzyzcjy in #4795
- [Docs] Update DeepGemm at README.md by @yinfan98 in #4886
- Update CODEOWNERS by @zhyncs in #4889
- Delete test_deep_gemm.py by @yinfan98 in #4891
- Add deepseek style fused moe group gate selection kernel by @qingquansong in #4530
- quick fix: add default for new kernel by @yinfan98 in #4898
- remove setup for sgl-kernel by @zhyncs in #4899
- [Misc] Clean m.def and add Development Tips by @yinfan98 in #4890
- fix allreduce test by @yizhang2077 in #4909
- Support page size > 1 + eagle by @merrymercy in #4908
- Fix retract for page size > 1 by @merrymercy in #4914
- [Feature] use pytest for sgl-kernel by @adarshxs in #4896
- fix bmm fp8 by @zhyncs in #4926
- Fix the timeout for unit-test-2-gpu in pr-test.yml by @merrymercy in #4927
- Fix 2-gpu CI test and suppress some warnings by @merrymercy in #4930
- [feat] add fa3 in sgl-kernel by @yinfan98 in #4902
- Fix sglang frontend's incorrect dependency on torch by @seplos in #4931
- [Fix] avoid stream sync and torch compile in prefill for fa3 backend by @Fridge003 in #4932
- cleanup sgl-kernel by @zhyncs in #4933
- [Fix] Improve Lora tests and reduce CI runtime by @Fridge003 in #4925
- Fix DeepSeek bug causing 2.2% MMLU drop when TP!=DP by @fzyzcjy in #4883
- [Fix] Add torch compile for torch.clamp back by @Fridge003 in #4936
- Fix oom error for large page size by @xiezhq-hermann in #4913
- [feat] interface for platforms abstraction by @Alcanderian in #4928
- [Fix] revert clean m.def for cudagraph by @yinfan98 in #4944
- refactor: multimodal data by @mickqian in #4754
- bump sgl-kernel v0.0.6 by @zhyncs in #4950
- [Build] Fix cuda12.8 build error in nvfp4_scaled_mm_kernels.cu by @guoyuhong in #4953
- use fa3 in sgl-kernel by @zhyncs in #4954
- Revert PR 4764 & 4813 related to R1 RoPE by @guoyuhong in #4959
- [Feature] Support DeepEP Low Latency by @liz-badada in #4767
- update bench_serving by @zhyncs in #4958
- Prevent memory leak of retract_decode when page_size > 1 by @xiezhq-hermann in #4977
- [VLM RLHF] Take Image input for verl vlm rollout by @JustinTong0323 in #4915
- Large page size aligned hierarchical caching by @xiezhq-hermann in #4581
- bug fix for hicache host eviction by @xiezhq-hermann in #4989
- sgl scaled_fp8_quant support output padding by @BBuf in #4861
- Add Eagle Speculative Decoding to FA3 Backend by @qingquansong in #4951
- Update tokenizer_manager.py by @yangky11 in #5008
- [sgl-kernel] per token group quant support COLUMN MAJOR by @BBuf in #4817
- update cutlass tag by @xiezhq-hermann in #5011
- Feature/revise docs ci by @renxinx in #5009
- fix: fix illegal cuda memory access at fused_moe_kernel by @saltyfish66 in #4727
- [Build] Support build sgl-kernel with ccache by @guoyuhong in #5020
- fix deepgemm as well by @xiezhq-hermann in #5030
- try to fix ci oserror by @BBuf in #5024
- Replace enable_flashinfer_mla argument with attention_backend by @Fridge003 in #5005
- Small refactor DeepEPMode to clean up code a bit by @fzyzcjy in #4992
- [Fix] fix fa3 build at cu118 by @yinfan98 in #5036
- Revert "Replace enable_flashinfer_mla argument with attention_backend" by @merrymercy in #5048
- bump sgl-kernel v0.0.7 by @zhyncs in #5046
- update eagle-3 docs by @simveit in #4796
- Add LlavaLlamaForCausaLM in MultiModal Processors by @ravi03071991 in #5039
- Update the retry count by @zhyncs in #5051
- upgrade sgl-kernel v0.0.7 by @zhyncs in #5049
- [2/3] fix dsv3 awq issue by @AniZpZ in #4625
- Feature/revise docs ci by @renxinx in #5056
- Add H20 fused MoE kernel tuning configs for DeepSeek V3/R1 by @M0gician in #5057
- [fix] remove `cuda_device_count_stateless` by @Alcanderian in #5060
- Small refactor DeepEPDispatcher into subclasses by @fzyzcjy in #4994
- Support async DeepEP by splitting into two stages by @fzyzcjy in #4995
- Cleanup unused resources after DeepEP operation by @fzyzcjy in #4996
- Add DeepSeek V3/R1 shared experts fusion by @BBuf in #4918
- [deepep] fix: shared experts are not initialized when shared experts fusion is disabled by @ch-wan in #5072
- fix dummy-load deepseekv2 by @inkcherry in #4535
- support sgl-kernel on blackwell by @zhyncs in #5074
- FA3 Spec Decoding to support top k = 1 and add cuda graph support by @hebiao064 in #5050
- [Revision] Replace enable_flashinfer_mla argument with attention_backend by @Fridge003 in #5052
- upgrade transformers 4.51.0 by @zhyncs in #5088
- sgl-kernel transfer custom allreduce from trt kernel to vllm kernel by @yizhang2077 in #5079
- bump sgl-kernel 0.0.8 by @zhyncs in #5089
- python transfer custom allreduce from trt kernel to vllm kernel by @yizhang2077 in #5080
- bump v0.4.4.post4 by @zhyncs in #5091
- Fix: Reduce the number of document ci attempts to avoid long ci running by @minleminzui in #5097
- Add Llama4 support by @CatherineSue in #5092
- Fix refactor error - fp8.py by @HaiShaw in #5106
- bump v0.4.5 by @zhyncs in #5117
New Contributors
- @DellCurry made their first contribution in #3964
- @lausannel made their first contribution in #4427
- @JiangJiaWei1103 made their first contribution in #4453
- @xu-song made their first contribution in #4454
- @yinghai made their first contribution in #4481
- @tanzelin430 made their first contribution in #4202
- @huiwq1990 made their first contribution in #4474
- @woodx9 made their first contribution in #3612
- @ccw1996 made their first contribution in #2798
- @solrex made their first contribution in #4418
- @U-rara made their first contribution in #4446
- @tianyuzhou95 made their first contribution in #4542
- @chromecast56 made their first contribution in #4247
- @strgrb made their first contribution in #4515
- @liz-badada made their first contribution in #4232
- @Hongbosherlock made their first contribution in #4583
- @guoyuhong made their first contribution in #4592
- @wenscarl made their first contribution in #4558
- @penguin-wwy made their first contribution in #4665
- @xutizhou made their first contribution in #4643
- @BroadbentJim made their first contribution in #4679
- @lkm2835 made their first contribution in #4064
- @c1lovez1 made their first contribution in #4698
- @alexsun07 made their first contribution in #4631
- @tbzhang made their first contribution in #4605
- @XucSh made their first contribution in #4721
- @yuhsuan-t made their first contribution in #4435
- @Thysrael made their first contribution in #4784
- @Conless made their first contribution in #4797
- @gmlwns2000 made their first contribution in #4638
- @jsuchome made their first contribution in #4341
- @danielholanda made their first contribution in #4833
- @tarinkk made their first contribution in #4770
- @ocss884 made their first contribution in #4819
- @b8zhong made their first contribution in #3969
- @jondurbin made their first contribution in #4809
- @JustinTong0323 made their first contribution in #4788
- @ZhuJiaqi9905 made their first contribution in #4843
- @vincent-4 made their first contribution in #3949
- @warjiang made their first contribution in #4487
- @lengrongfu made their first contribution in #4528
- @jcbjcbjc made their first contribution in #4492
- @Fr4nk1inCs made their first contribution in #4772
- @seplos made their first contribution in #4931
- @yangky11 made their first contribution in #5008
- @renxinx made their first contribution in #5009
- @saltyfish66 made their first contribution in #4727
- @inkcherry made their first contribution in #4535
Full Changelog: v0.4.4...v0.4.5