$$ \lambda $$
Having spent the last year or so applying Generative AI / LLMs to interesting tasks related to health informatics and the challenges of using electronic health record (EHR) data for real-world use cases, I’ve started to develop a metaphor that helps structure the way I think about where we are with AGI / AI right now in late 2023.
If human intelligence represents an array of multimodal approaches, loops, interconnections, randomness interfering at times, and pathways through various kinds of structures and methods to accomplish tasks, then today’s most advanced generative AI that is available for public consumption (such as ChatGPT-4) is a bit like a lambda function for human intelligence.
This blog/tutorial post does a great job summarizing the concept of lambdas:
Lambda Function Origins
Anonymous functions, lambda expressions, or function literals are all the same thing. Lambda (or \lambda) is the name given to anonymous functions in some languages like Python. These are functions not bound by an explicit identifier. This name is derived from Lambda calculus, a mathematical system to express computation introduced by Dr. Alonzo Church in the 1930s.
Dr. Church’s incredible work was aimed at exploring the foundations of mathematics and the reduction of complex functions into simple 1-argument “lambda” expressions. These are small functions with no name that are used to make bigger functions through a process called currying. Currying means that multi-argument functions get broken down into simple 1-argument functions.
This means complex functions can be torn apart into anonymized bite-size chunks. Furthermore, creating simple anonymous functions is convenient when you don’t necessarily want to define a full function for a very simple task.
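To make the currying idea concrete, here is a minimal Python sketch. The names `volume`, `curried_volume`, and `base_2_by_3` are made up for illustration and do not come from the quoted post.

```python
# A three-argument function written the usual way.
def volume(length, width, height):
    return length * width * height

# The same computation, curried: each lambda takes exactly one argument
# and returns another one-argument function until all inputs are supplied.
curried_volume = lambda length: lambda width: lambda height: length * width * height

# Supplying arguments one at a time yields reusable intermediate functions.
base_2_by_3 = curried_volume(2)(3)

print(volume(2, 3, 4))          # 24
print(curried_volume(2)(3)(4))  # 24
print(base_2_by_3(10))          # 60
```

In everyday Python you would rarely curry by hand like this, but it shows how a multi-argument function can be rebuilt from small, anonymous one-argument pieces.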
In my work in the data space, I started using lambdas when I wanted to apply a simple function to each row of a pandas DataFrame and the logic needed a few more steps than an off-the-shelf function provides. For example:
df['bar'] = df.apply(lambda row: row['foo'].upper() if row['foo'][0] in ['a','b','c'] else row['other'], axis=1)
You could write out a function that would do this, but when you’re scripting quickly, sometimes you want quick and dirty results (which is not to say that this represents a good coding practice that should be put into production)!
This full-fledged function, process_row_foobar, does the same thing as the lambda function above. It checks whether the first character of the 'foo' column value is in the list ['a', 'b', 'c']. If it is, the function returns the uppercased 'foo' value. Otherwise, it returns the value from the 'other' column.
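def process_row_foobar(row):
    # Uppercase 'foo' when it starts with 'a', 'b', or 'c'; otherwise fall back to 'other'.
    if row['foo'][0] in ['a', 'b', 'c']:
        return row['foo'].upper()
    else:
        return row['other']

# Pass the named function to apply, row by row (axis=1).
df['bar'] = df.apply(process_row_foobar, axis=1)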
In this toy example, I’d say it’s actually smarter to use the full function because it’s more readable and you could re-use the same function elsewhere in the same flow to enhance your adherence to DRY (Don’t Repeat Yourself), and if you needed to add new logic to the function that’s used in multiple places, you’d only have to edit it once.
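One practical payoff of the named function is that it can be exercised on its own, without a DataFrame. The sketch below is a hypothetical variation, not the original code: it swaps the `row['foo'][0]` check for `str.startswith`, so an empty string falls through to 'other' instead of raising an IndexError, and it uses plain dicts as stand-ins for rows.

```python
def process_row_foobar(row):
    """Return uppercased 'foo' if it starts with a, b, or c; otherwise 'other'."""
    # str.startswith accepts a tuple of prefixes and returns False for an
    # empty string, whereas row['foo'][0] would raise IndexError.
    if row['foo'].startswith(('a', 'b', 'c')):
        return row['foo'].upper()
    return row['other']

# Plain dicts stand in for DataFrame rows (an assumption for illustration);
# the function only needs row['foo'] and row['other'] lookups.
sample_rows = [
    {'foo': 'apple', 'other': 'x'},
    {'foo': 'zebra', 'other': 'y'},
    {'foo': '', 'other': 'z'},
]

results = [process_row_foobar(row) for row in sample_rows]
print(results)  # ['APPLE', 'y', 'z']
```

Because the change lives in one place, every call site that passes this function to `df.apply(..., axis=1)` picks up the new behavior at once.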
Nevertheless, there are situations where lambdas are helpful. Often, you can write or explore data starting with lambdas and then refactor to full functions later, once you know what you want the production code to accomplish.
Lambda as metaphor has already made its way into modern cloud architecture: Amazon’s serverless service for running discrete tasks triggered by events is named AWS Lambda.
In my experience using LLMs / Generative AI for production tasks on large healthcare datasets as of late 2023, it’s more appropriate to think of them as lambdas for human intelligence than as candidates for AGI. If you entrust too many operations to the model, things get lost in the sauce, meanings are misinterpreted, and hallucinations develop. Furthermore, it’s not yet easy to run an LLM over a large dataset because of limited context windows. Even when you can use plugins to do “advanced data analysis,” you’re constrained by both the inherent limitations of the LLM and the amount of data you can put into it.
Instead, you can think of LLMs as operating as lambdas for human intelligence. You can ask a question like “Is the mapping between this value and this value correct?”, provide a detailed description of your inputs, expected outputs, and term definitions, and get good results.
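As a rough sketch of what "LLM as lambda" could look like in code, the example below wraps one narrowly scoped question in a small function. Everything here is an assumption for illustration: `build_mapping_prompt`, `check_mapping`, and the sample values are made up, and `ask_llm` is a placeholder for whatever model client you use (no real API is shown). A fake model stands in so the sketch runs offline.

```python
from typing import Callable, Dict


def build_mapping_prompt(source_value: str, target_value: str,
                         definitions: Dict[str, str]) -> str:
    """Assemble one small question with its inputs, definitions, and expected output."""
    definition_lines = "\n".join(
        f"- {term}: {meaning}" for term, meaning in definitions.items()
    )
    return (
        "Task: decide whether the mapping below is correct.\n"
        f"Source value: {source_value}\n"
        f"Target value: {target_value}\n"
        f"Term definitions:\n{definition_lines}\n"
        "Expected output: answer with exactly one word, YES or NO."
    )


def check_mapping(source_value: str, target_value: str,
                  definitions: Dict[str, str],
                  ask_llm: Callable[[str], str]) -> bool:
    """Ask one small question and reduce the free-text reply to a boolean."""
    reply = ask_llm(build_mapping_prompt(source_value, target_value, definitions))
    return reply.strip().upper().startswith("YES")


# Hypothetical stand-in for a real model call; it only inspects the prompt text.
def fake_llm(prompt: str) -> str:
    return "YES" if "Source value: HTN" in prompt else "NO"


definitions = {"HTN": "abbreviation for hypertension"}
print(check_mapping("HTN", "Hypertension", definitions, fake_llm))  # True
print(check_mapping("DM", "Hypertension", definitions, fake_llm))   # False
```

Passing the model call in as an argument keeps the "lambda" small and swappable: the surrounding workflow, and the human reviewing it, stays in charge of chaining the steps together.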
In my experience, we’re not yet at the point, or close to it, where you could completely turn over tasks that require complex chaining of thought processes, ambiguity, and inference of context to AI agents with today’s available levels of capability (mid-November 2023). I may have to eat these words very soon.
However, much like the relation between lambdas and more complex functions, there’s a lot of potential in finding workflows or processes that are currently 100% owned by human intelligence and decomposing them into parts that can be replicated with lambdas of human intelligence (Generative AI / LLMs) and those that can’t. Humans have always used tools to take over parts of our work — that’s why almost none of the people likely to read this have to spend a lot of time gathering small twigs and other material to sustain fires to keep warm in the coming winter.
Hopefully, the combination of human and artificial intelligence can become more powerful than the sum of its parts and help us solve important problems as we move forward.