Apache spark images automated build pipeline #1

Merged
merged 17 commits into from
Apr 4, 2024
59eb7d8
feat(spark-base): Add new spark-base image (java/scala only) without …
idirze Mar 28, 2024
1d935f7
feat(spark): Add new spark image (java/scala only) with okdp extensio…
idirze Mar 28, 2024
ad94e07
feat(spark-py): Add new spark-py image with python basic requirements
idirze Mar 28, 2024
a9c0880
feat(spark-r): Add new spark-r image with R basic requirements
idirze Mar 28, 2024
2201f45
Build pipeline - Add basic pipeline to build and push into CI registry
idirze Mar 28, 2024
cbabf51
Build pipeline - Add scala 2.12/2.13, python and r integration tests …
idirze Mar 28, 2024
22ffda2
Build pipeline - Publish to official registry
idirze Mar 28, 2024
fc9d948
Build pipeline - Publish to official registry repo based on the lates…
idirze Mar 28, 2024
c76825c
Spark image - Remove Control M characters from spark parent pom.xml
idirze Mar 28, 2024
03828a4
Build pipeline - Add action to run CI and publish against versions an…
idirze Mar 29, 2024
f0e6088
Build pipeline - Add branch name as suffix for CI images latest tag
idirze Mar 29, 2024
fd92972
release-please - automate release/publish process and periodic images…
idirze Mar 29, 2024
980b001
fix(spark-base): Add missing gpg keys in the spark project release keys
idirze Mar 29, 2024
d81faae
Add license file
idirze Apr 2, 2024
8f55245
Update documentation
idirze Apr 2, 2024
5da5b8a
feat(spark): Minimize minio/aws sdk v1/v2 dependencies to reduce sp…
idirze Apr 4, 2024
4796814
Prepare first official release
idirze Apr 4, 2024
51 changes: 51 additions & 0 deletions .build/ci-versions.yml
@@ -0,0 +1,51 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

########### ~~~ CI TEST MATRIX VERSIONS ~~~ ######################################################################################
########### DEFINE A REFERENCE SET OF CI COMBINATIONS TO TEST, TO AVOID TESTING EVERY COMBINATION, WHICH TAKES A LOT OF TIME #####
#### PUT THE SPARK VERSIONS TO TEST IN CORRESPONDENCE WITH 'reference-versions.yml' FILE #########################################
#### !!! ANY DECLARED TEST VERSION WHICH IS NOT PRESENT IN 'reference-versions.yml' FILE IS SKIPPED DURING BUILD !!! #############
#### REMOVE, UPDATE OR ADD VERSIONS TO TEST ######################################################################################
versions:
  # Maximum python version supported by spark-3.2.x: 3.9
  # Java support: 8/11
  - python_version: 3.9
    spark_version: [3.2.4]
    java_version: [11]
    scala_version: [2.12]
    hadoop_version: 3.2
  # Maximum python version supported by spark-3.3.x: 3.10
  # Java support: 8/11/17
  - python_version: '3.10'
    spark_version: [3.3.4]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # Maximum python version supported by spark-3.4.x: 3.11
  # Java support: 8/11/17
  - python_version: 3.11
    spark_version: [3.4.2]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # https://spark.apache.org/releases/spark-release-3-5-0.html
  # Minimum supported java version: 17/21
  - python_version: 3.11
    spark_version: [3.5.1]
    java_version: [17]
    scala_version: [2.13]
    hadoop_version: 3
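
The header comment of this file says that a declared test version missing from 'reference-versions.yml' is skipped during the build. The build scripts are not part of the diff shown here, so the following is only a minimal sketch of that rule. The function name `filter_against_reference` is hypothetical, the sketch assumes the match is done on `spark_version` alone, and the version `3.6.0` is made up to show the skip.

```python
def filter_against_reference(candidates, reference):
    """Drop every candidate spark version that the reference matrix does not declare.

    Assumption: membership is checked on spark_version only; the real
    pipeline may compare more fields.
    """
    known = set()
    for entry in reference:
        known.update(entry["spark_version"])

    kept = []
    for entry in candidates:
        spark = [v for v in entry["spark_version"] if v in known]
        if spark:  # an entry with no remaining version is skipped entirely
            kept.append({**entry, "spark_version": spark})
    return kept


reference = [
    {"python_version": "3.9", "spark_version": ["3.2.1", "3.2.2", "3.2.3", "3.2.4"]},
    {"python_version": "3.11", "spark_version": ["3.5.1"]},
]
ci = [
    {"python_version": "3.9", "spark_version": ["3.2.4"]},
    # "3.6.0" is a made-up version used only to show the skip
    {"python_version": "3.11", "spark_version": ["3.5.1", "3.6.0"]},
]

for entry in filter_against_reference(ci, reference):
    print(entry["python_version"], entry["spark_version"])
```

Values are written as strings in the sketch so that the comparison does not depend on how a YAML loader types each scalar.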

60 changes: 60 additions & 0 deletions .build/images.yml
@@ -0,0 +1,60 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

images:
  - name: docker.io/eclipse-temurin
    tags:
      - ${java_version}-jre-jammy
  - name: spark-base
    dependsOn: docker.io/eclipse-temurin
    tags:
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-${git_release_version}
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')-${git_release_version}
      #- spark-${spark_version}-scala-${scala_version}-java-${java_version}-${git_commit_short_sha}
  - name: spark
    dependsOn: spark-base
    tags:
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-${git_release_version}
      - spark-${spark_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')-${git_release_version}
      #- spark-${spark_version}-scala-${scala_version}-java-${java_version}-${git_commit_short_sha}
  - name: spark-py
    dependsOn: spark
    tags:
      - spark-${spark_version}-python-${python_version}-scala-${scala_version}-java-${java_version}
      - spark-${spark_version}-python-${python_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')
      - spark-${spark_version}-python-${python_version}-scala-${scala_version}-java-${java_version}-${git_release_version}
      - spark-${spark_version}-python-${python_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')-${git_release_version}
      #- spark-${spark_version}-python-${python_version}-scala-${scala_version}-java-${java_version}-${git_commit_short_sha}
  - name: spark-r
    dependsOn: spark
    tags:
      - spark-${spark_version}-r-${r_version}-scala-${scala_version}-java-${java_version}
      - spark-${spark_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')
      - spark-${spark_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-${git_release_version}
      - spark-${spark_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')-${git_release_version}
      #- spark-${spark_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-${git_commit_short_sha}
  - name: spark-py-r
    dependsOn: spark-py
    tags:
      - spark-${spark_version}-python-${python_version}-r-${r_version}-scala-${scala_version}-java-${java_version}
      - spark-${spark_version}-python-${python_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')
      - spark-${spark_version}-python-${python_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-${git_release_version}
      - spark-${spark_version}-python-${python_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')-${git_release_version}
      #- spark-${spark_version}-python-${python_version}-r-${r_version}-scala-${scala_version}-java-${java_version}-${git_commit_short_sha}
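
The image list above combines two ideas: tag templates with `${...}` placeholders plus a `$(date '+%Y-%m-%d')` shell substitution, and a `dependsOn` chain that implies a build order. How the pipeline actually consumes this file is not visible in the diff, so the following is a rough sketch under assumptions. `render_tag` and `build_order` are hypothetical helper names, the date is passed in explicitly instead of running a shell, and the order is derived with the standard library `graphlib` module (Python 3.9+).

```python
import re
from datetime import date
from graphlib import TopologicalSorter


def render_tag(template, variables, today):
    """Expand ${name} placeholders and the $(date '+%Y-%m-%d') substitution."""
    # date.isoformat() yields YYYY-MM-DD, the same shape as '+%Y-%m-%d'
    rendered = template.replace("$(date '+%Y-%m-%d')", today.isoformat())
    return re.sub(r"\$\{(\w+)\}", lambda m: str(variables[m.group(1)]), rendered)


def build_order(images):
    """Return image names so that each image comes after the one it dependsOn."""
    graph = {
        img["name"]: {img["dependsOn"]} if "dependsOn" in img else set()
        for img in images
    }
    return list(TopologicalSorter(graph).static_order())


images = [
    {"name": "spark-py", "dependsOn": "spark"},
    {"name": "spark", "dependsOn": "spark-base"},
    {"name": "spark-base", "dependsOn": "docker.io/eclipse-temurin"},
    {"name": "docker.io/eclipse-temurin"},
]
variables = {"spark_version": "3.5.1", "scala_version": "2.13", "java_version": 17}
template = "spark-${spark_version}-scala-${scala_version}-java-${java_version}-$(date '+%Y-%m-%d')"

print(build_order(images))
print(render_tag(template, variables, date(2024, 4, 4)))
```

Images that share a parent, such as `spark-py` and `spark-r`, have no fixed order relative to each other in this sketch; only the parent-before-child constraint is guaranteed.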
50 changes: 50 additions & 0 deletions .build/reference-versions.yml
@@ -0,0 +1,50 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

### REFERENCE MATRIX VERSIONS ##############################
#### !!! DO NOT DELETE ANY ELEMENT !!! #####################
######## APPEND ONLY WHEN A NEW SPARK VERSION IS RELEASED ##
############ USED AS REFERENCE DURING BUILD ################
versions:
  # Maximum python version supported by spark-3.2.x: 3.9
  # Java support: 8/11
  - python_version: 3.9
    spark_version: [3.2.1, 3.2.2, 3.2.3, 3.2.4]
    java_version: [11]
    scala_version: [2.12, 2.13]
    hadoop_version: 3.2
  # Maximum python version supported by spark-3.3.x: 3.10
  # Java support: 8/11/17
  - python_version: '3.10'
    spark_version: [3.3.1, 3.3.2, 3.3.3, 3.3.4]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # Maximum python version supported by spark-3.4.x: 3.11
  # Java support: 8/11/17
  - python_version: 3.11
    spark_version: [3.4.1, 3.4.2]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # https://spark.apache.org/releases/spark-release-3-5-0.html
  # Minimum supported java version: 17/21
  - python_version: 3.11
    spark_version: [3.5.1]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
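
Each entry of the reference matrix carries lists for `spark_version`, `java_version` and `scala_version`, which suggests that a build job exists per combination. The expansion code is not shown in this diff, so the following is a sketch only: `expand` is a hypothetical name and a plain Cartesian product is an assumption about how the lists are combined.

```python
from itertools import product


def expand(entry):
    """Yield one flat combination per (spark, java, scala) triple of an entry."""
    for spark, java, scala in product(
        entry["spark_version"], entry["java_version"], entry["scala_version"]
    ):
        yield {
            "python_version": str(entry["python_version"]),
            "spark_version": spark,
            "java_version": java,
            "scala_version": scala,
            "hadoop_version": str(entry["hadoop_version"]),
        }


# The spark-3.4.x entry from the reference matrix, with values as strings
entry = {
    "python_version": "3.11",
    "spark_version": ["3.4.1", "3.4.2"],
    "java_version": [17],
    "scala_version": ["2.12", "2.13"],
    "hadoop_version": 3,
}

combos = list(expand(entry))
print(len(combos))  # 2 spark x 1 java x 2 scala = 4
for combo in combos:
    print(combo["spark_version"], combo["java_version"], combo["scala_version"])
```

Under that assumption the four entries of the reference matrix expand to 8 + 8 + 4 + 2 = 22 combinations, which is the cost the smaller CI matrix is meant to avoid.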

50 changes: 50 additions & 0 deletions .build/release-versions.yml
@@ -0,0 +1,50 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

########### CURRENT MATRIX VERSIONS ################################################################################
#### PUT THE SPARK VERSIONS TO BUILD IN CORRESPONDENCE WITH 'reference-versions.yml' FILE ##########################
#### !!! ANY DECLARED VERSION WHICH IS NOT PRESENT IN 'reference-versions.yml' FILE IS SKIPPED DURING BUILD !!! ####
#### REMOVE, UPDATE OR ADD VERSIONS ################################################################################
versions:
  # Maximum python version supported by spark-3.2.x: 3.9
  # Java support: 8/11
  - python_version: 3.9
    spark_version: [3.2.1, 3.2.2, 3.2.3, 3.2.4]
    java_version: [11]
    scala_version: [2.12, 2.13]
    hadoop_version: 3.2
  # Maximum python version supported by spark-3.3.x: 3.10
  # Java support: 8/11/17
  - python_version: '3.10'
    spark_version: [3.3.1, 3.3.2, 3.3.3, 3.3.4]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # Maximum python version supported by spark-3.4.x: 3.11
  # Java support: 8/11/17
  - python_version: 3.11
    spark_version: [3.4.1, 3.4.2]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
  # https://spark.apache.org/releases/spark-release-3-5-0.html
  # Minimum supported java version: 17/21
  - python_version: 3.11
    spark_version: [3.5.1]
    java_version: [17]
    scala_version: [2.12, 2.13]
    hadoop_version: 3
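
In the three version files, `python_version: '3.10'` is quoted while `3.9` and `3.11` are not. A likely reason is that an unquoted `3.10` is resolved as a floating-point number by YAML loaders and loses its trailing zero. The sketch below imitates that effect with plain `float` parsing instead of a YAML library; the helper name `as_unquoted_scalar` is made up for illustration.

```python
def as_unquoted_scalar(text):
    """Show what a decimal scalar looks like after being read as a float."""
    return str(float(text))


for version in ("3.9", "3.10", "3.11"):
    print(version, "->", as_unquoted_scalar(version))
# 3.9 -> 3.9
# 3.10 -> 3.1
# 3.11 -> 3.11
```

Quoting every `python_version` value would make the files uniform and protect future entries such as `3.20` from the same truncation.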

40 changes: 40 additions & 0 deletions .github/actions/free-disk-space/action.yml
@@ -0,0 +1,40 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

name: Free disk space
description: Free GitHub runner disk space

runs:
  using: composite
  steps:
    - name: Free Disk Space (Ubuntu)
      uses: jlumbroso/free-disk-space@main
      with:
        # this might remove tools that are actually needed,
        # if set to "true" but frees about 6 GB
        tool-cache: false

        # all of these default to true, but feel free to set to
        # "false" if necessary for your workflow
        android: true
        dotnet: true
        haskell: true
        large-packages: true
        docker-images: true
        swap-storage: true



29 changes: 29 additions & 0 deletions .github/actions/setup-buildx/action.yaml
@@ -0,0 +1,29 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

name: Set up QEMU and Docker Buildx
description: Set up Docker Buildx

runs:
  using: composite
  steps:
    - name: Set up QEMU 📦
      uses: docker/setup-qemu-action@v3

    - name: Set up Docker Buildx 📦
      uses: docker/setup-buildx-action@v3
      with:
        driver-opts: network=host
37 changes: 37 additions & 0 deletions .github/actions/setup-kind/action.yaml
@@ -0,0 +1,37 @@
#
# Copyright 2024 tosit.io
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

name: Setup kind
description: Deploy kind cluster

runs:
  using: composite
  steps:
    - name: Create k8s Kind Cluster
      uses: helm/kind-action@v1
      with:
        # https://github.com/helm/kind-action?tab=readme-ov-file#inputs
        verbosity: 10
        cluster_name: "kind-ci-${{ github.job }}"
        ignore_failed_clean: true # Ignore a failure of the post-job cluster deletion
        wait: "180s" # Maximum time to wait for the Kind cluster to become ready

    - name: Print Kind cluster state
      run: |
        kubectl cluster-info
        kubectl get pods -A
        kubectl describe node
      shell: bash