Lambda layers - great idea! Stash some code/libraries in a re-usable block, so you can then write smaller lambdas. Great for importing libraries and other things.
Only … I get confused, especially when I need to install a Python layer that has compiled parts - so I need to compile on the right OS.
So here goes …
Python
First example, requests
Worked through the first example. Worked cleanly and made sense.
It’s using a venv called create_layer, and the installed packages end up under create_layer/lib/python3.12/site-packages/ (certifi, idna, urllib3 and friends).
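The build step for that first example goes something like this (reconstructed from the same pattern as the xgboost shell files further down, so treat the exact commands as my guess rather than a copy from the docs):
# Roughly what the first example does: create the venv, activate it,
# and install requests (plus its dependencies) into it.
python3.12 -m venv create_layer
source create_layer/bin/activate
pip install requests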
Then the magic happens with:
mkdir python
cp -r create_layer/lib python/
zip -r layer_content.zip python
It copies everything from the venv’s lib folder into a new folder called python, which then becomes the root of the zip - Lambda unpacks layers under /opt, and /opt/python/lib/python3.12/site-packages is one of the paths the Python runtime looks in …
python/
python/lib/
python/lib/python3.12/
python/lib/python3.12/site-packages/
python/lib/python3.12/site-packages/pip-24.0.dist-info/
python/lib/python3.12/site-packages/idna/
python/lib/python3.12/site-packages/idna/__pycache__/
python/lib/python3.12/site-packages/urllib3-2.2.2.dist-info/
python/lib/python3.12/site-packages/urllib3-2.2.2.dist-info/licenses/
python/lib/python3.12/site-packages/charset_normalizer/
...
And you can create the layer from the CLI like this:
aws lambda publish-layer-version --layer-name python-requests-layer \
--zip-file fileb://layer_content.zip \
--compatible-runtimes python3.11 \
--compatible-architectures "arm64"
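Publishing only creates the layer version; it still has to be attached to a function. Something like this (function name, region, account id and version number are all placeholders):
# Attach the published layer version to an existing function.
aws lambda update-function-configuration \
  --function-name my-requests-function \
  --layers arn:aws:lambda:eu-west-1:123456789012:layer:python-requests-layer:1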
Second example, xgboost
The second example in the documentation installs numpy; I decided to go straight to xgboost. For the numpy example they specify a URL for the exact wheel; for xgboost it looks like it’s just the package name.
So the first shell file was:
python3.12 -m venv create_layer
source create_layer/bin/activate
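# The next line is the key bit: --platform plus --only-binary=:all: makes pip
# download prebuilt manylinux wheels instead of compiling anything locally,
# which is how you sidestep the compile-on-the-right-OS problem.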
pip install -r requirements.txt --platform=manylinux2014_x86_64 --only-binary=:all: --target ./create_layer/lib/python3.12/site-packages
requirements.txt was:
xgboost
and the second shell file was:
mkdir python
cp -r create_layer/lib python/
zip -r layer_content.zip python
which gives me an 89 MB zip file: too large to upload directly, but I can work around it.
Expected this to work …
aws lambda publish-layer-version --layer-name python-xgboost-layer \
--zip-file fileb://layer_content.zip \
--compatible-runtimes python3.12 \
--compatible-architectures "arm64"
An error occurred (RequestEntityTooLargeException) when calling the PublishLayerVersion operation: Request must be smaller than 70167211 bytes for the PublishLayerVersion operation
So I created it manually instead.
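For the record, the CLI route around the direct-upload limit is to stage the zip in S3 and point publish-layer-version at it; roughly this (the bucket name is a placeholder):
# Stage the oversized zip in S3, then publish the layer from there.
aws s3 cp layer_content.zip s3://my-layer-staging-bucket/layer_content.zip
aws lambda publish-layer-version --layer-name python-xgboost-layer \
  --content S3Bucket=my-layer-staging-bucket,S3Key=layer_content.zip \
  --compatible-runtimes python3.12 \
  --compatible-architectures "arm64"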
And here’s a sample lambda I can use:
import json
import xgboost as xgb

def lambda_handler(event, context):
    # Example usage of XGBoost
    dmatrix = xgb.DMatrix([[1, 2], [3, 4]], label=[0, 1])
    params = {
        'objective': 'binary:logistic',
        'max_depth': 2,
        'eta': 1,
        'nthread': 2,
        'eval_metric': 'auc'
    }
    bst = xgb.train(params, dmatrix, num_boost_round=2)
    return {
        'statusCode': 200,
        'body': json.dumps('XGBoost Lambda Layer Test Successful')
    }
And that … worked. I have a lambda executing XGBoost. Ok, it’s giving me a deprecation warning, but it did work.
Slight feeling of anticlimax, but there you go. And full credit to the documentation, a nicely worked example.
Mwah hah hah …
OK, when I tried it with my lambda, I got an error:
[ERROR] ImportError: sklearn needs to be installed in order to use this module
Traceback (most recent call last):
File "/var/task/classification.py", line 21, in lambda_handler
ev_calculator = EVCalculator('/tmp/model.json')
File "/var/task/calculator.py", line 8, in __init__
self.model = XGBClassifier()
File "/opt/python/lib/python3.12/site-packages/xgboost/core.py", line 738, in inner_f
return func(**kwargs)
File "/opt/python/lib/python3.12/site-packages/xgboost/sklearn.py", line 1443, in __init__
super().__init__(objective=objective, **kwargs)
File "/opt/python/lib/python3.12/site-packages/xgboost/sklearn.py", line 737, in __init__
raise ImportError(
And then I can’t do the same trick as above with scikit-learn:
Failed to create layer version: Unzipped size must be smaller than 262144000 bytes
Why is the code different? Ah, the example code just does some training with xgb.train; my code goes through the XGBClassifier wrapper, which needs scikit-learn installed. I see other people having the same issue.
Took a look at the scikit-learn layer and removed numpy from it - maybe I should try packaging them together next? That took it from 90 MB to 60 MB.
Nope, as separate layers I got:
Layers consume more than the available size of 262144000 bytes
Argh. Combined into one layer I got:
Failed to create layer version: Unzipped size must be smaller than 262144000 bytes.
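That limit is on the unzipped size of the function plus all its layers combined, so it’s cheaper to check locally before each publish attempt. Something like:
# How big does the layer actually unpack to?
du -sh python/
# Biggest offenders, package by package:
du -sh python/lib/python3.12/site-packages/* | sort -h | tail
# Or read the uncompressed total straight off the zip:
unzip -l layer_content.zip | tail -1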
Solution coming
Container lambdas. That solved the size problem. It added a little complexity around how to build and deploy the image to ECR, but it’s working nicely.
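For completeness, the container route looks roughly like this. The Dockerfile uses the public AWS Python base image; the account id, region, repository name, role and function name are all placeholders, and the module/handler names are just taken from the traceback above:
# Minimal Dockerfile: AWS Python base image plus the heavy dependencies.
cat > Dockerfile <<'EOF'
FROM public.ecr.aws/lambda/python:3.12
# requirements.txt here would list xgboost, scikit-learn, etc.
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY classification.py calculator.py ${LAMBDA_TASK_ROOT}/
CMD ["classification.lambda_handler"]
EOF
# Build the image and push it to ECR (placeholder account/region/repo).
aws ecr create-repository --repository-name xgboost-lambda
aws ecr get-login-password --region eu-west-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com
docker build -t xgboost-lambda .
docker tag xgboost-lambda:latest 123456789012.dkr.ecr.eu-west-1.amazonaws.com/xgboost-lambda:latest
docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/xgboost-lambda:latest
# Create the function from the image (placeholder role ARN).
aws lambda create-function --function-name xgboost-classifier \
  --package-type Image \
  --code ImageUri=123456789012.dkr.ecr.eu-west-1.amazonaws.com/xgboost-lambda:latest \
  --role arn:aws:iam::123456789012:role/my-lambda-role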
Java
That’s next week.