refactor: Refactor of L0_backend_python and the env subtest #7838

Draft
wants to merge 3,470 commits into
base: main
3470 commits
87165b2
Use current time when overwriting model configuration. (#6727)
whoisj Jan 17, 2024
7b06a37
Added docs for otel context propagation (#6804)
oandreeva-nv Jan 18, 2024
b6e017e
Fix typos in trace.md (#6808)
rmccorm4 Jan 18, 2024
3e79b2a
Fix test_model_config_overwite in L0_lifecycle (#6818)
GuanLuo Jan 19, 2024
7edeb9f
Improve L0_backend_python on shm reliability (#6803)
kthui Jan 19, 2024
3bff367
Remove boost::filesystem (#6810)
rmccorm4 Jan 22, 2024
bc71da0
Generate unittest xml reports from L0_python_api (#6822)
rmccorm4 Jan 23, 2024
6192c6e
Add unit test reports to L0_json, L0_metrics, L0_response_cache, L0_b…
rmccorm4 Jan 25, 2024
a514a05
Update trace summary script (#6758)
pskiran1 Jan 25, 2024
28f497c
Add gsutil upload retry helper function (#6817)
kthui Jan 25, 2024
ddfdb2a
Add test for shutdown while unloading in background (#6835)
kthui Jan 27, 2024
56e4232
Handle 0 dimension output for generate endpoint (#6833)
krishung5 Jan 29, 2024
d98a59c
tensorrt-llm benchmarking test (#6771)
pskiran1 Jan 29, 2024
2309bce
Update README.md and versions post-24.01 (#6847)
mc-nv Jan 30, 2024
d0e2653
Use libmamba solver for L0_backend_python env test. Fix pytest not fo…
krishung5 Jan 30, 2024
f92732d
Add test for shutdown while loading model (#6837)
kthui Jan 31, 2024
776e641
Adding OpenTelemetry Batch Span Processor (#6842)
oandreeva-nv Feb 1, 2024
b0a495a
Support Double-Type Inference Request/Response Parameters (#6755)
fpetrini15 Feb 1, 2024
508929a
Updating vllm version to 0.3.0 (#6858)
oandreeva-nv Feb 7, 2024
738c98f
Python Backend Windows Support (#6830)
fpetrini15 Feb 8, 2024
3d79568
Add support for Oracle Cloud in deploy (#6850)
bruno-garbaccio Feb 9, 2024
1df73dc
Add link to TRTLLM metrics docs (#6874)
rmccorm4 Feb 13, 2024
4294cc6
Add unit test reports to L0_dlpack_multi_gpu and L0_warmup (#6873)
krishung5 Feb 14, 2024
f078bfb
Set OV version to 2023.3.0 (#6880)
kthui Feb 14, 2024
80fc56c
Fixing StringTo uint32_t used only by tracing (#6883)
oandreeva-nv Feb 14, 2024
8a2a229
Update 'main' to track development of 2.44.0 / 24.03 (#6892)
mc-nv Feb 16, 2024
59e267f
Add response statistics (#6869)
kthui Feb 17, 2024
21a7fc5
Fix busyop test for L0_memory_growth (#6900)
krishung5 Feb 22, 2024
60872b9
Add cancellation into response statistics (#6904)
kthui Feb 23, 2024
8d8b607
Install required pip pkgs (#6906)
krishung5 Feb 24, 2024
adafa4f
Match forward headers case insensitively. (#6889)
yinggeh Feb 27, 2024
551978b
Add note on --cache-config spacing and fix typos (#6929)
rmccorm4 Mar 1, 2024
246f46c
Remove ignore files that are not in use by repository (#6893)
mc-nv Mar 2, 2024
1dcf2cf
Update README and versions for 2.43.0 / 24.02 (#6886)
mc-nv Feb 15, 2024
9be77f1
Set ONNX Runtime version 1.17.2
mc-nv Mar 1, 2024
19b02a2
Expose tritonserver args in values.yaml (#5582)
okyspace Mar 4, 2024
d0f332b
Parameterize git repository (#6934)
nv-kmcgill53 Mar 6, 2024
c2299d5
Enhance bound check for shm offset (#6914)
kthui Mar 8, 2024
110251b
Allow non-decoupled model to send response and FINAL flag separately …
GuanLuo Mar 8, 2024
25266a5
Add test for max queue delay timeout prompt response (#6938)
kthui Mar 8, 2024
b012bd0
Test improved input validation errors (#6933)
indrajit96 Mar 9, 2024
52a1cd2
Update Dockerfile.sdk with OpenAI support (#6941)
tgerdesnv Mar 11, 2024
b2e6e7e
Test Correlation Id string support for BLS (#6963)
pskiran1 Mar 11, 2024
9786e40
Update 'main' to track development of 2.45.0 / 24.04 (#6974)
mc-nv Mar 11, 2024
e92abf2
Add AsyncIO HTTP compression test (#6975)
kthui Mar 13, 2024
8139431
Install `genai-pa` into SDK container (#6942)
mc-nv Mar 13, 2024
5c6e487
extend existing tests with more parameters (#6951)
yf711 Mar 15, 2024
9f16eef
Exposing trace context to python backend (#6985)
oandreeva-nv Mar 15, 2024
8b36aa8
Add documentation for mapping between Triton Errors and HTTP status c…
Tabrizian Mar 19, 2024
afaa6f4
Remove hatch version (#7009)
tgerdesnv Mar 21, 2024
fdbfb27
Update vLLM to 0.3.2 for gemma support (#6918)
kebe7jun Mar 21, 2024
2be127b
Add missing copyright for L0_trace (#6996)
oandreeva-nv Mar 25, 2024
df753d7
fix sphinx warnings (#7030)
yinggeh Mar 25, 2024
a844eda
Add meetup invite banner (#7049)
rmccorm4 Mar 27, 2024
8a208d7
Update 'main' post-24.03 (#7051)
mc-nv Apr 1, 2024
1dfa33d
Fix incorrect version updates (#7073)
Tabrizian Apr 4, 2024
879a505
Update compose.py and remove mention of tensorflow1 in documentation …
jbkyang-nvi Apr 4, 2024
e9e3648
Add testing for iterative scheduler backlogged requests (#7059)
Tabrizian Apr 5, 2024
dbeb198
Remove conda package manager (#7069)
mc-nv Apr 5, 2024
e1d58c7
fix link (#7044)
yinggeh Apr 5, 2024
74660f1
Add Documentation from Additional Repositories to nvidia.docs.com (#7…
yinggeh Apr 5, 2024
2150fc2
Fix html image rendering in sphinx documentation (#7084)
tanmayv25 Apr 8, 2024
cbd6967
Remove obsolete mention of image tags (#7085)
tanmayv25 Apr 9, 2024
aff4b93
HTTP live connections on server shutdown (#6986)
kthui Apr 9, 2024
10f1c8d
Enable autodocs for python client library API documentation (#7082)
tanmayv25 Apr 9, 2024
5e20ef6
Updated vllm version (#7095)
oandreeva-nv Apr 10, 2024
52f97b5
Disable Dynamic Log File (#7092)
yinggeh Apr 11, 2024
159b060
Validate system shared memory region size when registering a region (…
rmccorm4 Apr 11, 2024
196caf0
Decoupled Async Execute (#7062)
kthui Apr 11, 2024
5b739db
Add trace mode and trace config entries in trace settings API (#7050)
indrajit96 Apr 11, 2024
0a4c87b
Update 'main' to track development of 2.46.0 / 24.05 (#7105)
mc-nv Apr 11, 2024
3b6c6f9
Validate the memory requested for the infer request is not out of bou…
jbkyang-nvi Apr 12, 2024
b889687
Add copyright for tritonclient_api (#7109)
Tabrizian Apr 12, 2024
7529f0e
Disable dynamic trace file (#7106)
yinggeh Apr 13, 2024
e116a2a
Update L0_logging to reflect error when trying to update log_file (#7…
yinggeh Apr 13, 2024
8e88f2c
Add new cached channel test (#7123)
jbkyang-nvi Apr 17, 2024
e965287
Fix gRPC frontend race condition (#7110)
kthui Apr 17, 2024
233c4b2
Remove client testing of server trace to match discontinued support f…
matthewkotila Apr 17, 2024
2de09ee
Re-enable PA trace testing but remove setting trace file (#7131)
matthewkotila Apr 19, 2024
dba31c2
Fix windows build for shared memory bound checking(#7137)
jbkyang-nvi Apr 19, 2024
09b34be
Fix test for cached channels (#7130)
jbkyang-nvi Apr 19, 2024
1da454c
Use a lower concurrency with more repetition for L0_memory_growth (#7…
krishung5 Apr 23, 2024
f243276
Replace deprecated tritongrpcclient package (#7061)
Tabrizian Apr 24, 2024
365b86a
Avoid the HTTP Error 403: rate limit exceeded error (#7155)
krishung5 Apr 25, 2024
987deaa
Clarify instance group documentation for ensemble (#7162)
Tabrizian Apr 25, 2024
d432266
Add extra footer to documentation (#7163)
mc-nv Apr 26, 2024
5239ff0
Add metrics model namespacing label test (#7141)
kthui Apr 26, 2024
16e5470
Update `main` post-24.04 (#7160)
mc-nv Apr 30, 2024
3c99c95
Remove meetup note now that the event has completed (#7179)
Tabrizian May 3, 2024
a9d3dac
Validate CUDA SHM region registration size (#7178)
krishung5 May 7, 2024
ee6d238
Fix python client Shm Leak (#7172)
fpetrini15 May 7, 2024
c724193
Add test for sequence state after cancellation (#7167)
kthui May 7, 2024
27c2142
Rename triton_tensorrtllm_worker -> trtllmExecutorWorker (#7194)
krishung5 May 8, 2024
884ca4e
Tests for Top Level Request Caching for Ensemble Models (#7074)
lkomali May 9, 2024
6694b74
Test cuda shared memory offset and byte size out of bounds(#7202)
jbkyang-nvi May 10, 2024
dd71d3b
Upgrade the golang version to 1.22.3 (#7208)
tanmayv25 May 13, 2024
a669145
Update 'Dockerfile' Python path to include DALI (#7216)
mc-nv May 14, 2024
4dcda7f
Remove the dependency on CUDA driver (#7224)
krishung5 May 15, 2024
d6fe6e6
Multiple Model Configurations (#7185)
yinggeh May 16, 2024
d356d6e
Fix L0_backend_python iGPU PyTorch installation (#7231)
kthui May 16, 2024
747f5d4
Fix the L0_simple_go_client (#7239)
tanmayv25 May 17, 2024
0370485
Add section on ensemble model caching (#7234)
rmccorm4 May 18, 2024
3e97828
Add testing for escaped log messages
nnshah1 May 20, 2024
9faf444
updating log parsing in test
nnshah1 May 21, 2024
620f095
Add documentation on logging formats
nnshah1 May 21, 2024
0c4228c
Return an error if --load-model is specified without explicit model c…
rmccorm4 May 22, 2024
1322225
Exclude Jax example from Python 3.8 (#7260)
krishung5 May 23, 2024
2d2c0b5
add test for shape validation (#7195)
jbkyang-nvi May 24, 2024
9cfc53a
Enhance OTEL testing to capture and verify Cancellation Requests and …
indrajit96 May 24, 2024
60a06bf
Fix Python 3.11 env (#7274)
krishung5 May 28, 2024
729b677
Bump vllm to v0.4.2 (#7198)
kebe7jun May 29, 2024
ea095c9
Update main to track development for 2.47.0 / r24.06 (#7291)
tanmayv25 May 29, 2024
20f3487
Update 'main' post 24.05 release (#7298)
tanmayv25 May 29, 2024
c907231
Update openvino to 2024.0.0 (#7299)
krishung5 May 30, 2024
c3eb5ca
docs: Update PR templates (#7290)
jbkyang-nvi May 30, 2024
13f819b
docs: Add default template that diverts to sub templates (#7306)
jbkyang-nvi May 30, 2024
d189a87
Added new flag for GPU peer access API control (#7261)
indrajit96 Jun 3, 2024
4d113dc
build: Update vllm version to v0.4.3 (latest) (#7309)
oandreeva-nv Jun 3, 2024
6a303f8
fix: Fix L0_input_validation--base (#7304)
yinggeh Jun 4, 2024
34390d7
fix: Remove onnxruntime libraries from system path (#7323)
tanmayv25 Jun 5, 2024
b0ea306
Change TensorRT-LLM (#7143)
mc-nv Jun 5, 2024
b6734dd
Add testing for libtorch cudnn (#7286)
Tabrizian Jun 5, 2024
31f00b6
Fix gRPC streaming non-decoupled segfault if sending response and fin…
kthui Jun 6, 2024
497475e
Add support for response sender in the default mode (#7311)
kthui Jun 6, 2024
8ce3890
fix: Handling grpc cancellation edge-case:: Cancelling at step START …
oandreeva-nv Jun 6, 2024
8745160
test: Add testing for CUDA EP options (#7328)
krishung5 Jun 6, 2024
03ca720
ci: Support BF16 data type in TensorRT backend (#7310)
pskiran1 Jun 7, 2024
c0e4c81
test: Update error messages to comply with core change (#7326)
yinggeh Jun 7, 2024
7236796
ci: Restrict numpy to version 1.x (#7327)
KrishnanPrash Jun 7, 2024
3135eb5
test: Fix the test to expect updated error messages (#7340)
tanmayv25 Jun 12, 2024
fe63eba
test: Python models filtering outputs based on requested outputs (#7338)
kthui Jun 12, 2024
5f8497f
test: Add test for sequence flags in ensemble streaming inference (#7…
indrajit96 Jun 12, 2024
fd1d9c4
fix: Fix version for setuptools and grpcio-tools. Remove cudnn 8 inst…
krishung5 Jun 18, 2024
f326993
ci: Add INT64 Datatype Support for Shape Tensors in TensorRT Backend …
pskiran1 Jun 20, 2024
9e55dab
Update 15-container-copyright.txt (#7375)
Tabrizian Jun 26, 2024
0f4c9d3
Update `main` post -24.06 (#7380)
mc-nv Jun 28, 2024
686cf1a
test: Add input byte size tests using C APIs (#7372)
yinggeh Jul 3, 2024
33d7e7e
[refactor]: Refactor Frontend Trace OpenTelemetry Implementation (#7390)
oandreeva-nv Jul 5, 2024
65a9140
[fix]: grpc state cleanup fix (#7409)
oandreeva-nv Jul 5, 2024
4415430
[build]: vllm version update (#7405)
oandreeva-nv Jul 5, 2024
8c5b94c
[feat]:Custom Backend Tracing (#7403)
oandreeva-nv Jul 5, 2024
66e4fff
build: Reduce intermediate layers (#7408)
krishung5 Jul 8, 2024
e9b811c
test: Remove AWS bucket on test failure (#7342)
kthui Jul 8, 2024
dabb7cb
fix: Fix error message for L0_trt_compat (#7432)
krishung5 Jul 10, 2024
2f299d1
feat: Support for request id field in generate API (#7392)
shreyas-samsung Jul 10, 2024
22d9261
perf: Improve response throughput of a single gRPC stream (#7404)
kthui Jul 12, 2024
b263bfc
test: Tests for Metrics API enhancement to include error counters (#7…
indrajit96 Jul 12, 2024
3421429
Update NGC versions post-24.07 release (#7469)
pvijayakrish Jul 25, 2024
96ef8a7
[build]: Bumping vllm version to v0.5.3.post1 (#7453)
oandreeva-nv Jul 25, 2024
f151f8a
ci: Fix shape and reformat free tensor handling in the input byte siz…
pskiran1 Jul 27, 2024
b8a3629
chore: PA Migration From Client (#7449)
fpetrini15 Jul 29, 2024
5e61a01
test: Refactor cpu metrics tests to make L0_metrics more stable (#7476)
rmccorm4 Jul 29, 2024
e713208
test: Add BF16 test for python backend (#7483)
rmccorm4 Jul 30, 2024
3443dd6
test: Improve L0_logging stability (#7486)
rmccorm4 Jul 31, 2024
839faf7
ci: Return custom exit code to indicate known shm leak failure in L0_…
krishung5 Jul 31, 2024
d4b585d
Including 'tritonserver.lib' into final package (#7491)
mc-nv Aug 2, 2024
327ee02
build: Add default value for argument 'TRITON_REPO_ORGANIZATION' from…
zhanga5 Aug 5, 2024
5b33a25
chore:Purge PA from Client Repo (#7488)
fpetrini15 Aug 6, 2024
04e0d85
PA Migration: Update L0_client_build_variants (#7505)
fpetrini15 Aug 7, 2024
3c7263f
test: Add test for sending response after sending complete final flag…
kthui Aug 7, 2024
ea3ebca
Add vLLM x Triton user meetup announcement (#7509)
harryskim Aug 8, 2024
a5ad309
Fix benchmarking tests (#7461)
pskiran1 Aug 10, 2024
61466d4
feat: Add vLLM counter metrics access through Triton (#7493)
yinggeh Aug 16, 2024
cadd112
build: RHEL 8 Compatibility (#7519)
nv-kmcgill53 Aug 16, 2024
5611ca1
feat: Add GRPC error codes to GRPC streaming if enabled by user. (#7499)
indrajit96 Aug 16, 2024
6857dc3
test: Add python backend tests for the new histogram metric (#7540)
yinggeh Aug 17, 2024
c91d1e5
test: Load new model version should not reload loaded existing model …
kthui Aug 20, 2024
a7a43a2
Intermittent `L0_decoupled_grpc_error` crash fixed. (#7552)
indrajit96 Aug 20, 2024
3735d99
ci: Raise Documentation Generation Errors (#7559)
fpetrini15 Aug 22, 2024
8e56e30
docs: Add tensorrtllm_backend into doc generation (#7563)
krishung5 Aug 23, 2024
be1a0a5
build: RHEL8 EA2 Backends (#7568)
fpetrini15 Aug 27, 2024
ef6afcd
Release: Update NGC versions post-24.08 release (#7565)
pvijayakrish Aug 27, 2024
c88aec5
docs: Add python backend to windows build command (#7572)
krishung5 Aug 27, 2024
3ea493f
docs: Triton TRT-LLM user guide (#7529)
krishung5 Aug 27, 2024
01438d8
Build: Updating to allow passing DOCKER_GPU_ARGS at model generation …
pvijayakrish Aug 27, 2024
5104900
feat: Python Deployment of Triton Inference Server (#7501)
KrishnanPrash Aug 30, 2024
89a9038
fix: Adding copyright info (#7591)
KrishnanPrash Sep 3, 2024
cb1204d
test: Refactor core input size checks (#7592)
yinggeh Sep 4, 2024
8da14cc
Don't Build `tritonfrontend` for Windows. (#7599)
fpetrini15 Sep 7, 2024
9076d2c
fix: Add reference count tracking for shared memory regions (#7567)
pskiran1 Sep 11, 2024
3eab666
build/test: RHEL8 EA3 (#7595)
fpetrini15 Sep 11, 2024
e452b58
Fix: Add mutex lock for state completion check in gRPC streaming to p…
pskiran1 Sep 17, 2024
a93de16
Update fetch_models.sh (#7621)
vd-nv Sep 19, 2024
b4525aa
ci: Set stability factor to a higher value (#7634)
lkomali Sep 20, 2024
e44cf29
[docs] Removed vLLM meetup announcement (#7673)
oandreeva-nv Oct 1, 2024
fe0e41e
Update the versions post 24.09 release.
pvijayakrish Sep 25, 2024
c2fa60c
Build: Update triton version in Map (#7610)
pvijayakrish Sep 11, 2024
26a05ed
Update versions post 24.09
fpetrini15 Sep 7, 2024
b0adf31
Dockerfile.win10.min - Update dependency versions (#7633)
mc-nv Sep 24, 2024
86dbef3
Update server versions post 24.09
pvijayakrish Sep 26, 2024
1fa799e
ci: Reducing flakiness of `L0_python_api` (#7674)
KrishnanPrash Oct 2, 2024
3a21f61
[doc]Adjusted formatting of the warning (#7675)
oandreeva-nv Oct 3, 2024
1df30ed
fix: usage of ReadDataFromJson in array tensors (#7624)
v-shobhit Oct 7, 2024
9bbee48
fix: `tritonfrontend` gRPC Streaming Segmentation Fault (#7671)
KrishnanPrash Oct 7, 2024
71a285a
test: Enhance Python gRPC streaming test to send multiple requests (#…
kthui Oct 7, 2024
d6488fd
refactor: Removing `Server` subclass from `tritonfrontend` (#7683)
KrishnanPrash Oct 8, 2024
fb430c7
feat: Add copyright hook (#7666)
pranavm-nvidia Oct 8, 2024
d13235c
build: Adding `tritonfrontend` to `build.py` (#7681)
KrishnanPrash Oct 9, 2024
466fed4
feat: OpenAI Compatible Frontend (#7561)
rmccorm4 Oct 11, 2024
f9ca1b8
docs: Add beta note to OpenAI compatible API (#7695)
rmccorm4 Oct 12, 2024
c730982
fix: Fix bug when targeting the TRT-LLM backend ensemble (#7700)
blongnv Oct 16, 2024
0200d2c
test: Allow ensemble to create the final response even if some of the…
kthui Oct 16, 2024
1a54d83
test: Update server repo for some tests (#7704)
jbkyang-nvi Oct 16, 2024
2961cf8
docs: Add example outputs to OpenAI Frontend docs (#7691)
KrishnanPrash Oct 16, 2024
01e77a8
chore: Fix genai-perf command and add missing copyrights (#7710)
rmccorm4 Oct 16, 2024
aeb20a1
docs: Clarify meanings of ensemble key and value (#7711)
kthui Oct 17, 2024
dedb9e7
fix: Re-enables copyright hook, updates GitHub Action to only run pre…
pranavm-nvidia Oct 18, 2024
940aa22
fix: Fix L0_perf_nomodel shared memory (#7709)
kthui Oct 18, 2024
6f6cbe0
Change compute capablity min value (#7708)
mc-nv Oct 18, 2024
aa93b95
build: `tritonfrontend` support for no/partial endpoint builds (#7605)
KrishnanPrash Oct 18, 2024
ee198de
Revert "Change compute capablity min value (#7708)" (#7721)
mc-nv Oct 21, 2024
12b1968
test: Test and document histogram latency metrics (#7694)
yinggeh Oct 23, 2024
10d7eaa
fix: Copy models out of NFS before starting Triton to avoid intermitt…
rmccorm4 Oct 23, 2024
2f8de73
docs: Add support matrix for model parallelism in OpenAI Frontend (#7…
rmccorm4 Oct 23, 2024
dcfc6a0
test: Add L0_additional_dependency_dirs (#7707)
fpetrini15 Oct 23, 2024
128f19a
test: Add small delay to L0_lifecycle test_load_new_model_version aft…
kthui Oct 24, 2024
604b2aa
Removing caching on windows. (#7717)
mc-nv Oct 29, 2024
4453fa3
feat: Metrics Support in `tritonfrontend` (#7703)
KrishnanPrash Oct 31, 2024
284e71d
build: RHEL8 Python Backend (#7744)
fpetrini15 Oct 31, 2024
3bfacf8
chore: ensure proper clean up in shared memory related tests (#7729)
GuanLuo Oct 31, 2024
c7589f1
refactor: Include job id and nightly tag to results uploaded (#7751)
kthui Oct 31, 2024
0b724f2
Update test script for TRT compatibility test to check for
pvijayakrish Oct 27, 2024
3b4fabd
Build: Update main branch post 24.10 release (#7754)
pvijayakrish Nov 1, 2024
67c59c8
ci: Adding tests for `numpy>=2` (#7756)
KrishnanPrash Nov 1, 2024
06b358a
Reapply "Change compute capability min value (#7708)" (#7757)
mc-nv Nov 1, 2024
8941e15
build: Install tritonfrontend and tritonserver wheels by default in p…
KrishnanPrash Nov 4, 2024
1f7a516
Fix model generation (#7764)
mc-nv Nov 4, 2024
6191c67
test: Test per-model metric customization and document custom histogr…
yinggeh Nov 6, 2024
4725600
fix: Fixing pip installation as a system package (#7768)
KrishnanPrash Nov 6, 2024
5f8f07b
fix: Adding copyright support for `.pyi` files (#7769)
KrishnanPrash Nov 6, 2024
0269a3c
fix: Skip copyrights check for "expected" files in L0_model_config (#…
yinggeh Nov 7, 2024
51b304f
Update 'main' to track development of 2.53.0 / 24.12 (#7771)
mc-nv Nov 7, 2024
d2ecac1
test: OpenAI frontend invalid chat tokenizer network issue WAR (#7779)
kthui Nov 8, 2024
60f22e4
Update ONNX version for generated models (#7785)
mc-nv Nov 13, 2024
3c7a263
test: RHEL Filesystem Tests (#7788)
fpetrini15 Nov 14, 2024
66026e5
Update model generation scenario (#7793) (#7797)
mc-nv Nov 15, 2024
d4d9ebc
fix: Fix L0_input_validation (#7800)
pskiran1 Nov 19, 2024
3815390
build: Support RHEL ORT TensorRT Execution Provider (#7812)
fpetrini15 Nov 20, 2024
2eb481d
ci: modifying stat count for `L0_server_status` (#7820)
KrishnanPrash Nov 21, 2024
fb89be7
build: update build.py to pass versions as input parameter and conver…
nvda-mesharma Nov 21, 2024
16154f2
fix: Resolve integer overflow in Load API file decoding (#7787)
pskiran1 Nov 22, 2024
eb1d290
feat: Enable deferred unregistering of shared memory regions after in…
pskiran1 Nov 25, 2024
9e181b9
ci: Fix L0_cuda_shared_memory (#7832)
pskiran1 Nov 26, 2024
3ac229e
Update `main` branch post 24.11 (#7829)
mc-nv Nov 26, 2024
64b0a28
Moved shared memory tests to their own test script; Generated a hash …
nv-kmcgill53 Nov 26, 2024
293095d
subtests make use of the subtest_properties hash map
nv-kmcgill53 Nov 26, 2024
234e36b
Passing properties to the subtests
nv-kmcgill53 Nov 27, 2024
0254cea
Subtests making use of the properties passed to them
nv-kmcgill53 Nov 27, 2024
5bc5b91
Adding comment explaining the subtest_properties hash map
nv-kmcgill53 Nov 27, 2024
f6b3275
Fix failing tests for review
nv-kmcgill53 Nov 27, 2024
6 changes: 4 additions & 2 deletions .clang-format
@@ -2,7 +2,8 @@
 BasedOnStyle: Google

 IndentWidth: 2
-ContinuationIndentWidth: 2
+ColumnLimit: 80
+ContinuationIndentWidth: 4
 UseTab: Never
 MaxEmptyLinesToKeep: 2

@@ -34,4 +35,5 @@ BinPackArguments: true
 BinPackParameters: true
 ConstructorInitializerAllOnOneLineOrOnePerLine: false

-IndentCaseLabels: true
+IndentCaseLabels: true
+
24 changes: 24 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,24 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Description**
A clear and concise description of what the bug is.

**Triton Information**
What version of Triton are you using?

Are you using the Triton container or did you build it yourself?

**To Reproduce**
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

**Expected behavior**
A clear and concise description of what you expected to happen.
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] I have read the [Contribution guidelines](#../../CONTRIBUTING.md) and signed the [Contributor License
Agreement](https://github.com/NVIDIA/triton-inference-server/blob/master/Triton-CCLA-v1.pdf)
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] I ran pre-commit locally (`pre-commit install, pre-commit run --all`)
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify feature works -->
<!-- were e2e tests added?-->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
@@ -0,0 +1,50 @@
#### What does the PR do?
<!-- Describe your pull request here. Please read the text below the line, and make sure you follow the checklist.-->

#### Checklist
- [ ] PR title reflects the change and is of format `<commit_type>: <Title>`
- [ ] Changes are described in the pull request.
- [ ] Related issues are referenced.
- [ ] Populated [github labels](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) field
- [ ] Added [test plan](#test-plan) and verified test passes.
- [ ] Verified that the PR passes existing CI.
- [ ] Verified copyright is correct on all changed files.
- [ ] Added _succinct_ git squash message before merging [ref](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html).
- [ ] All template sections are filled out.
- [ ] Optional: Additional screenshots for behavior/output changes with before/after.

#### Commit Type:
Check the [conventional commit type](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type)
box here and add the label to the github PR.
- [ ] build
- [ ] ci
- [ ] docs
- [ ] feat
- [ ] fix
- [ ] perf
- [ ] refactor
- [ ] revert
- [ ] style
- [ ] test

#### Related PRs:
<!-- Related PRs from other Repositories -->

#### Where should the reviewer start?
<!-- call out specific files that should be looked at closely -->

#### Test plan:
<!-- list steps to verify -->
<!-- were e2e tests added?-->

- CI Pipeline ID:
<!-- Only Pipeline ID and no direct link here -->

#### Caveats:
<!-- any limitations or possible things missing from this PR -->

#### Background
<!-- e.g. what led to this change being made. this is optional extra information to help the reviewer -->

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
- closes GitHub issue: #xxx
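
The `<commit_type>: <Title>` rule in the checklists above can be checked mechanically. The following is a minimal sketch, not part of this PR: the function name `parse_pr_title` and the exact regular expression are assumptions made for illustration, and only the list of commit types is taken from the templates.

```python
import re

# Commit types copied from the "Commit Type" checklist in the templates.
COMMIT_TYPES = (
    "build", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# Hypothetical pattern for "<commit_type>: <Title>" with a non-empty title.
TITLE_PATTERN = re.compile(r"^(?P<type>[a-z]+): (?P<title>\S.*)$")


def parse_pr_title(title):
    """Return (commit_type, title_text) if the title follows the format, else None."""
    match = TITLE_PATTERN.match(title)
    if match is None or match.group("type") not in COMMIT_TYPES:
        return None
    return match.group("type"), match.group("title")


# The title of this PR follows the format; an older commit title does not.
print(parse_pr_title("refactor: Refactor of L0_backend_python and the env subtest"))
print(parse_pr_title("Fix typos in trace.md (#6808)"))
```

Under this sketch, bracketed prefixes such as `[fix]:` that appear in some earlier commit titles would be rejected, since the pattern expects the bare type followed by a colon and a space.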
13 changes: 13 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,13 @@
Thanks for submitting a PR to Triton!
Please go to the `Preview` tab above this description box and select the appropriate sub-template:

* [PR description template for Triton Engineers](?expand=1&template=pull_request_template_internal_contrib.md)
* [PR description template for External Contributors](?expand=1&template=pull_request_template_external_contrib.md)

If you already created the PR, please replace this message with one of
* [External contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_external_contrib.md)
* [Internal contribution template](https://raw.githubusercontent.com/triton-inference-server/server/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_internal_contrib.md)

and fill it out.


84 changes: 84 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,84 @@
# Copyright 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: "CodeQL"

on:
  pull_request:

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'python' ]
        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
        # Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support

    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v2
      with:
        languages: ${{ matrix.language }}
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.

        # Details on CodeQL's query packs refer to:
        # https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
        queries: +security-and-quality


    # Autobuild attempts to build any compiled languages (C/C++, C#, Go, or Java).
    # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
      uses: github/codeql-action/autobuild@v2

    # Command-line programs to run using the OS shell.
    # See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

    # If the Autobuild fails above, remove it and uncomment the following three lines.
    # modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.

    # - run: |
    #   echo "Run, Build Application using script"
    #   ./location_of_script_within_repo/buildscript.sh

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v2
      with:
        category: "/language:${{matrix.language}}"
45 changes: 45 additions & 0 deletions .github/workflows/pre-commit.yaml
@@ -0,0 +1,45 @@
# Copyright 2023-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

name: pre-commit

on:
  pull_request:

jobs:
  pre-commit:
    runs-on: ubuntu-22.04
    steps:
    - uses: actions/checkout@v3
      with:
        fetch-depth: 2
    - name: Get modified files
      id: modified-files
      run: echo "modified_files=$(git diff --name-only -r HEAD^1 HEAD | xargs)" >> $GITHUB_OUTPUT
    - uses: actions/setup-python@v3
    - uses: pre-commit/[email protected]
      with:
        extra_args: --files ${{ steps.modified-files.outputs.modified_files }}
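
The "Get modified files" step above joins the changed paths into one line that is then handed to pre-commit as `--files ...`. The sketch below approximates that string handling in Python. It is an illustration only: `build_extra_args` and the sample input are hypothetical, and the sketch ignores the quote and backslash handling that real `xargs` performs.

```python
def build_extra_args(diff_output):
    """Approximate the `--files ...` string the workflow passes to pre-commit."""
    # `xargs` with no command runs `echo`, so the newline-separated paths from
    # `git diff --name-only` collapse into a single space-separated line.
    names = diff_output.split()
    return "--files " + " ".join(names)


# Hypothetical `git diff --name-only` output, for illustration only.
sample = ".clang-format\n.gitignore\n.github/workflows/pre-commit.yaml\n"
print(build_extra_args(sample))
```

One consequence visible in the sketch: a path containing whitespace would be split into several arguments, so this approach assumes changed file names contain no spaces.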
28 changes: 17 additions & 11 deletions .gitignore
@@ -1,11 +1,17 @@
-/bazel-bin
-/bazel-ci_build-cache
-/bazel-genfiles
-/bazel-trtserver
-/bazel-out
-/bazel-serving
-/bazel-tensorflow
-/bazel-tensorflow_serving
-/bazel-testlogs
-/bazel-tf
-/bazel-workspace
+/build
+/builddir
+/.vscode
+*.so
+__pycache__
+tmp
+*.log
+*.xml
+test_results.txt
+artifacts
+cprofile
+*.prof
+
+# Test exclusions
+qa/L0_openai/openai
+tensorrtllm_models
+custom_tokenizer