Some Lambda Optimisations

Phil Larkin / Software Engineer | 10/05/2022


Since lockdown, I’ve “been to” a number of AWS online events and watched a number of random YouTube videos of people playing with Lambda, the Function as a Service (FaaS) offering from AWS. “Been to” is in quotes because of course I went nowhere, just like the rest of us, but I digress. Time and again I’ve heard the phrase “don’t store anything important locally” when people talk about AWS Lambda. Then, in my studying, I learned how to cache reusable objects, such as database connections, within a Lambda. That left me with a fuzzy understanding of what happens to the instance variables of the class whose method is called when my Lambda is invoked, specifically in an object-oriented language. Most of the examples out there are in Python or JavaScript, after all.

To be a bit clearer: in creating a Java-based Lambda, we write a non-static method within a class, and that method is called when the Lambda is invoked. Is a new instance of this class created every time? Is it cleared down somehow? Sure, I could trawl the internet for an answer, and I’d probably find it in the AWS documentation somewhere, or I could write some code of my own… And I do love writing code.

What We Know

The fairy tale is that you give AWS your function, and they just run it. No hardware, no virtual machine, no container, no nothing! But we are grown-ups and we know better. AWS in fact spins up a micro-VM to run your Lambda in. While that micro-VM is being invoked regularly, it stays around. If the Lambda starts taking more simultaneous invocations, more micro-VMs are spun up to deal with the traffic. If one isn’t called for a while, it’s thrown out and AWS reclaims the CPU and disk space for someone else’s Lambdas. You can see why, if you’ve stored something sensitive locally, like bank details, the next invocation would have access to it, and mistakes could be made. I’ll get into the best practices a little bit at the end. Overall, we know that we might very well hit the same micro-VM on subsequent invocations of the Lambda, and thus have access to the same bit of file system and even the same non-local variables.
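
To make that risk concrete, here is a minimal sketch in plain Java (no AWS libraries). LeakyHandler, SafeHandler and the sample values are hypothetical names made up for illustration. Calling the same object twice only simulates two invocations landing on the same warm micro-VM, and it assumes handler state survives between invocations, which is what the experiment further down tests.

```java
import java.util.HashMap;
import java.util.Map;

public class WarmStateLeakSketch {

    // Hypothetical handler that keeps per-request data in a field.
    // The field lives as long as the object does, not just for one call.
    public static class LeakyHandler {
        private final Map<String, String> scratch = new HashMap<>();

        public String handle(String caller, String secret) {
            // Whatever an earlier call stored is still visible here.
            String leftover = scratch.toString();
            if (secret != null) {
                scratch.put(caller, secret);
            }
            return leftover;
        }
    }

    // Same logic, but the scratch space is a local variable,
    // so it is recreated on every call.
    public static class SafeHandler {
        public String handle(String caller, String secret) {
            Map<String, String> scratch = new HashMap<>();
            String leftover = scratch.toString();
            if (secret != null) {
                scratch.put(caller, secret);
            }
            return leftover;
        }
    }

    public static void main(String[] args) {
        LeakyHandler leaky = new LeakyHandler();
        leaky.handle("alice", "acct-1234");
        // prints "leaky sees: {alice=acct-1234}"
        System.out.println("leaky sees: " + leaky.handle("bob", null));

        SafeHandler safe = new SafeHandler();
        safe.handle("alice", "acct-1234");
        // prints "safe sees: {}"
        System.out.println("safe sees: " + safe.handle("bob", null));
    }
}
```

The second caller never sent any account details, yet the leaky version can see the first caller’s. Keeping anything request-specific in local variables avoids that.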

Rather than calling the reuse of the same micro-VM a bug, AWS packaged it as a feature and told us that we can use it as a means of “caching” setup steps that would be common between invocations. Here’s a Python example pulled from AWS documentation and cut down for brevity’s sake:

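import logging
import sys

import pymysql

logger = logging.getLogger()

# rds_host, name, password and db_name are assumed to be defined above;
# that part of the AWS example was cut for brevity.

# This block runs once, when the micro-VM starts up, not on every invocation.
try:
    conn = pymysql.connect(host=rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)
except pymysql.MySQLError as e:
    logger.error(e)
    sys.exit()

def handler(event, context):
    # Reuses the connection created above.
    with conn.cursor() as cur:
        cur.execute('insert into Employee (EmpID, Name) values(1, "Joe")')
        conn.commit()

    return "Success"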


You can see how the database connection ‘conn’ is made outside the handler function, so the connection code runs only when the micro-VM starts up, while the connection itself is reused on every invocation. To me that’s perfectly clear. What’s not clear is how that translates when my function is in fact a method of a class, as it is with a Java-based Lambda.

What I Did

Well, I could have tried to read stuff to figure this out, but that’s boring, so I did stuff. The best way to resolve this, I figured, was to write a Java-based Lambda function that tests which kinds of variables persist between invocations. Obviously, local variables (variables declared within the method) will be recreated every time, but what of static and instance variables? Here’s my code:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// OutputObject is a custom response class, not shown here.
public class TestLambda implements RequestHandler<Object, OutputObject> {

    // One copy per loaded class.
    static int staticVarCount = 0;
    // One copy per TestLambda instance.
    int instanceVarCount = 0;

    @Override
    public OutputObject handleRequest(final Object event, final Context context) {

        staticVarCount++;
        instanceVarCount++;

        final String rsBody = "Java Lambda:\n"
            + "I've seen that static variable " + staticVarCount + " time(s).\n"
            + "I've seen that instance variable " + instanceVarCount + " time(s).";

        System.out.println(rsBody);
        return new OutputObject(200, rsBody);
    }
}


The output was:

Java Lambda:
I've seen that static variable 1 time(s).
I've seen that instance variable 1 time(s).
Java Lambda:
I've seen that static variable 2 time(s).
I've seen that instance variable 2 time(s).
Java Lambda:
I've seen that static variable 3 time(s).
I've seen that instance variable 3 time(s).


Etc., etc., etc… So that was simple! It turns out we can hold our common setup in either static or instance variables, and both survive between invocations on the same micro-VM. To push that further, we can put our setup code in the constructor, which is only invoked when the micro-VM is being initialised. So that’s handy.
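
As a rough illustration of the constructor idea, here is a sketch that runs without any AWS libraries. ExpensiveClient and Handler are hypothetical stand-ins (in a real Lambda the handler class would implement RequestHandler as above), and the main method only imitates what the runtime appears to do, going by the experiment: build the handler object once, then call it repeatedly.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConstructorSetupSketch {

    // Hypothetical stand-in for something costly to build,
    // such as a database connection or an SDK client.
    public static class ExpensiveClient {
        public static final AtomicInteger created = new AtomicInteger();

        public ExpensiveClient() {
            created.incrementAndGet();
        }

        public String query(String input) {
            return "result for " + input;
        }
    }

    // Hypothetical handler: setup lives in the constructor,
    // per-request work lives in handleRequest.
    public static class Handler {
        private final ExpensiveClient client;
        private int invocations = 0;

        public Handler() {
            // Runs once per Handler object, not once per request.
            this.client = new ExpensiveClient();
        }

        public String handleRequest(String event) {
            invocations++;
            return client.query(event) + " (invocation " + invocations + ")";
        }
    }

    public static void main(String[] args) {
        // Imitates a cold start followed by three warm invocations.
        Handler handler = new Handler();
        for (String event : new String[] {"a", "b", "c"}) {
            System.out.println(handler.handleRequest(event));
        }
        // prints "clients created: 1"
        System.out.println("clients created: " + ExpensiveClient.created.get());
    }
}
```

One thing to keep in mind: each micro-VM gets its own handler object, so anything built this way is per micro-VM and not shared across concurrent ones.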

What AWS Says

If you are talking to anyone from AWS or anyone who loves serverless on AWS, this is totally a really useful feature to speed up our Lambda functions, definitely not a confusing bug. In its defence, it seems to be quite a cute, useful bug. Let’s keep it! AWS also sets aside 512MB of ephemeral storage by default (configurable up to 10GB) for files your Lambda needs while it runs, and gives the Lambda full permission to create and edit files in the /tmp folder. Whether you can use this as some sort of cache is down to you and your use case. I can imagine a use of Python’s couch library here maybe.
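
For what a /tmp cache might look like in Java, here is a minimal sketch using only the standard library. TmpFileCacheSketch, getOrCompute and the sample key and value are all hypothetical. The demo writes to a fresh temporary directory so it runs anywhere; on Lambda the directory would be /tmp. It also assumes keys are simple file names and that a stale cached value is acceptable, since nothing here expires entries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class TmpFileCacheSketch {

    private final Path dir;

    public TmpFileCacheSketch(Path dir) {
        this.dir = dir;
    }

    // Returns the cached value for key if a cache file exists;
    // otherwise runs the loader, stores its result on disk and returns it.
    public String getOrCompute(String key, Supplier<String> loader) {
        Path file = dir.resolve(key + ".cache");
        try {
            if (Files.exists(file)) {
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            }
            String value = loader.get();
            Files.write(file, value.getBytes(StandardCharsets.UTF_8));
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // On Lambda this would be Paths.get("/tmp").
        Path dir = Files.createTempDirectory("lambda-cache-sketch");
        TmpFileCacheSketch cache = new TmpFileCacheSketch(dir);

        AtomicInteger loads = new AtomicInteger();
        Supplier<String> slowLoader = () -> {
            loads.incrementAndGet(); // counts how often the slow path runs
            return "reference-data-v1";
        };

        System.out.println(cache.getOrCompute("lookup", slowLoader));
        System.out.println(cache.getOrCompute("lookup", slowLoader));
        // prints "loader ran 1 time(s)"
        System.out.println("loader ran " + loads.get() + " time(s)");
    }
}
```

Like the variables above, the files only last as long as the micro-VM does, so the loader still has to work on a cold start.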

What’s The Point?

Bug, feature, whatever! This is a nice little trick that we can use to be better at serverless, and it’s best to take the wins where we can. My goal in the experiment was to understand its ramifications in Java. My goal in writing this insight was to share my broader learning with you. So I was at least half successful! As always, my code is on GitHub if you want to have a go yourself. Until next time.

