Fintech AI Review #9

Techno-optimism vs. regulation, LLMs vs. non-generative ML for underwriting, enterprise enthusiasm vs. the struggle for deployment, and the huge potential coming out of OpenAI DevDay

Nov 14, 2023

Welcome to all fellow aficionados of financial technology and artificial intelligence! In a previous issue, I joked (OK, half-joked) that we all must have a nightly “AI learning hour” to stay apprised of the activity in such a rapidly developing space. Now I wonder if an hour is enough! For evidence of this more-is-more approach, witness OpenAI’s DevDay last week, which rivaled a vintage Apple event for the number of “wait, there’s more” moments. An incredible number and variety of developments merit coverage and consideration, and that is how we’ll spend today’s newsletter.

The more we learn, the more questions arise, and the more potential paths emerge. Marc Andreessen’s Techno-Optimist Manifesto heralds the power of technology to improve any aspect of humanity. The Biden White House’s executive order on AI provides a sprawling perspective on government involvement in the field. AI OG Andrew Ng points out the risk of regulatory capture, where many of the government proposals to restrict AI are downstream of incumbent company efforts to stoke fear.

While LLMs are not ready to be used directly in making underwriting decisions, they can be quite valuable in creating some of the inputs required for more precise risk assessment, particularly in applying structure and labels to unstructured data. This was the subject of a very well-done panel I witnessed at Money20/20 last month. Meanwhile, there is robust quantitative evidence for the value of reinforcement learning algorithms in the optimization of credit limits, further demonstrating that ‘non-generative’ AI is highly applicable in financial services.

While many banks’ enthusiasm for generative AI has endured, with some even demonstrating products to regulators and rolling out customer-facing capabilities, other large institutions have struggled, particularly due to issues of talent, data infrastructure, and cost. Finally, I’ve included links to a couple of recent videos: an interview where I discussed what the best lenders are doing to adopt AI and automation in lending, as well as a long, technical deep dive on the ability of LLMs to provide analytical assistance.

As always, please share your thoughts, ideas, comments, and any interesting content. If you like this newsletter, please consider sharing it with your friends and colleagues. Happy reading!

Latest News & Commentary

New models and developer products announced at DevDay - OpenAI

OpenAI hosted its first-ever developer day this past week, and it was packed with big announcements. First, the company announced a new version of GPT-4 (GPT-4 Turbo), which has a 128k context window (i.e. you can put way more information in a prompt, 16x the previous version) and is also several times cheaper (one-third the price for input tokens and one-half the price for output tokens). It’s also trained on data through April 2023, adding more than a year and a half to the model’s ‘knowledge’ about the world. In addition, OpenAI introduced the ability to develop and use custom-built versions of ChatGPT, known somewhat confusingly as “GPTs”. Developers, and even people who don’t code, can build custom agents, providing specific instructions, augmented knowledge, and the ability to call 3rd party APIs. GPTs can be shared privately or publicly, and an upcoming “GPT Store” will let developers monetize their creations. There was also a slew of developer-focused announcements, including but not limited to: improved function calling, a version of GPT-4 with vision support, a conversation threading API, a new text-to-speech API, and the ability to fine-tune GPT-4 and even work with OpenAI to build custom models.

The magnitude and cadence of new releases here are pretty impressive, and the impact on the industry is potentially significant. First, it demonstrated that many early-stage “AI products” developed by startups as thin wrappers over OpenAI might just become features of ChatGPT, thereby eroding any competitive moat. Next, while the GPT Store has not yet launched, there’s clearly potential for it to become the primary ‘app store’ for conversational AI agents. Previously, developers who wanted to create applications on top of LLMs with a particular knowledge base and 3rd-party API integrations had to build more complex toolchains, often using frameworks such as LangChain or LlamaIndex. Now that it’s possible to do this in a no-code environment within ChatGPT itself, it will be fascinating to see what apps become available given a theoretically much wider developer base. For financial services, the impacts are less clear, other than to say that whatever you were building may have just gotten easier, cheaper, and faster to deploy. In addition to public-facing AI apps, I’d be eager to witness the proliferation of “GPTs” within financial institutions as shareable productivity-boosters.

The Techno-Optimist Manifesto - Marc Andreessen, a16z

In this rousing manifesto, Marc Andreessen lays out a passionate and detailed case for techno-optimism, the belief that the proper development and use of technology is critical for and capable of solving almost any problem, ultimately increasing the well-being of humanity. He begins with the assertion that we are being lied to, manipulated in the discourse by those who preach pessimism, resentment, and doom. Rather, he contends, technology has been the driving factor in the improvement of the human condition and a ‘lever on the world’. You may have already read Marc’s manifesto, but if you haven’t, you really should read it in full. It’s thought-provoking, bold, and for those of us who build and invest in technology out of a love for humanity, a refreshing antidote to the doomerism so pervasive in popular culture and politics.

Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence - The White House

I can’t promise a twenty-thousand-word executive order will be as exciting as the Techno-Optimist Manifesto covered above, but it’s obviously valuable to understand all sides of the AI regulation debate, including the impact on financial services, mentioned as a “critical field” in terms of AI impact. Biden’s order contains some good ideas, some bad ideas, and some misguided ideas, all too numerous and detailed to discuss in a single issue of this newsletter. The EO is much less a last word than an opening volley, and we’ll clearly see a ton of AI-related policy activity going forward, especially heading into the presidential election season of 2024.

Google Brain cofounder says Big Tech companies are inflating fears about the risks of AI wiping out humanity because they want to dominate the market - Business Insider

Amidst plenty of big news in AI policy and regulation, AI legend Andrew Ng weighed in with his view, namely that the biggest tech vendors are creating a climate of unjustified fear to achieve regulatory capture. According to Ng, "There are definitely large tech companies that would rather not have to try to compete with open source, so they're creating fear of AI leading to human extinction. It's been a weapon for lobbyists to argue for legislation that would be very damaging to the open-source community." There’s definitely a big-tech vs. open-source war brewing around AI. Many of the fear-based arguments look similar to those used by large incumbents trying to restrict more widespread development of previous technical advances, including encryption and the internet itself. As we’ve seen in so many other sectors, over-regulation can lead to many unintended consequences, including the entrenchment of the few large incumbents who can bear the regulatory burden. In fact, given how many times we’ve seen this exact phenomenon, in industries as diverse as banking, energy, pharma, housing, and agriculture, it’s almost odd how little attuned people are to the risk of regulatory capture. Ng published some more thoughts on this on his company’s blog.

Money20/20 Vegas Panel Recap: ChatGPT can write, but can AI underwrite? - Taktile

A couple of weeks ago, I was in Las Vegas for Money20/20, and even with an incredibly packed schedule, this panel was one of the few I made sure to attend. One of the panelists, Maik Taro Wehmeyer of Taktile, a modern decision engine for financial services, wrote an excellent recap of what I also found to be a compelling yet grounded conversation. Unlike other recent would-be revolutions in fintech (e.g. P2P lending, crypto), the panelists were quite rational and measured in their estimations of the capabilities of LLMs in underwriting. This very much aligns with my point of view as someone who has developed many risk models and strategies over the years using multiple generations of machine learning and has also experimented quite a bit with LLMs. While the non-deterministic and relatively unexplainable nature of LLMs makes them too risky for actual underwriting decisions (especially in highly-regulated markets like the U.S. and E.U.), they do have the power and potential to contribute to better risk prediction. One of the most compelling use cases is the ability of LLMs to add structure to unstructured data, creating labels that can then be used as inputs into better-understood and more observable machine learning architectures. This allows lenders to incorporate datasets that intuitively have some explanatory power but would otherwise be impractical to include in credit decisioning without manual intervention and all its inherent problems (cost, time, bias, etc.). I’m optimistic for this approach to using AI in the service of more accurate and faster credit decisioning using more diverse data sources and glad to see others ignoring ‘hype’ and approaching these issues in well-thought-out ways.
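To make the "LLM as labeler, conventional model as decision-maker" pattern a bit more concrete, here is a minimal Python sketch. It is my own illustration, not something presented on the panel. The category list, the sample transactions, and the `label_transaction` function are all hypothetical; in particular, `label_transaction` is a stand-in for an LLM call and uses simple keyword rules so the example runs offline. The point is only the shape of the pipeline: free text goes in, a closed set of labels comes out, and those labels become ordinary numeric features.

```python
from collections import Counter

# Closed label set (hypothetical): constraining the labeler to known
# categories keeps the downstream features stable and auditable.
CATEGORIES = ("payroll", "rent", "gambling", "loan_payment", "other")


def label_transaction(description: str) -> str:
    """Hypothetical stand-in for an LLM labeling call.

    A real system might send the description to a model along with a prompt
    listing CATEGORIES; keyword rules are used here so the sketch runs offline.
    """
    text = description.lower()
    keyword_to_label = {
        "payroll": "payroll",
        "direct dep": "payroll",
        "rent": "rent",
        "casino": "gambling",
        "loan pmt": "loan_payment",
    }
    for keyword, label in keyword_to_label.items():
        if keyword in text:
            return label
    return "other"


def build_features(descriptions: list[str]) -> dict[str, float]:
    """Turn labeled transactions into per-category share-of-count features."""
    labels = [label_transaction(d) for d in descriptions]
    # Map anything outside the closed set (e.g. an unexpected model output)
    # to "other" so the feature schema never changes.
    labels = [lab if lab in CATEGORIES else "other" for lab in labels]
    counts = Counter(labels)
    total = len(labels) or 1
    return {f"share_{c}": counts[c] / total for c in CATEGORIES}


# Made-up bank transaction descriptions for illustration only.
transactions = [
    "ACME CORP PAYROLL 0923",
    "Parkview Apts RENT oct",
    "LUCKY STAR CASINO LV",
    "POS 4421 COFFEE",
]
features = build_features(transactions)
print(features)
```

The resulting dictionary could feed a scorecard or gradient-boosted model like any other feature set, which keeps the actual credit decision in an architecture that is easier to explain and monitor.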

Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning - Alfonso-Sánchez et al.

I’ve often observed that while a ton of time and effort is spent by lenders, investors, data companies, and researchers trying to analyze and predict borrowers’ propensity to default, comparatively little attention is paid to the other strategic levers that affect the profitability of a credit portfolio, such as the optimization and management of credit limits. In my experience, there is great benefit to applying thoughtful analytics across the entire customer lifecycle of credit, which is why I was glad to come across this paper from researchers at Western University in Canada and the Universidad Nacional de Colombia.

“Although credit limit setting is an essential problem for traditional banking industries and Fintech companies, since identifying the adequate credit limit will define the profit and therefore the sustainability of the credit card portfolio, this question has not been widely studied. This contrasts with, for example, the default prediction problem, in which a large number of banking analytic research papers are published.”

The authors set out to explore the effectiveness of reinforcement learning (a family of machine learning techniques, a.k.a. “RL”) for developing an optimal policy for adjusting limits on credit cards, thereby balancing two conflicting objectives: maximizing portfolio revenue and minimizing expected losses. To do so, they used data from the credit card product of a Latin American financial “super app”. There are some interesting technical/mathematical details in the paper, including data structuring and experimental design work to account for the fact that the ‘rewards’ from a credit limit adjustment are not immediate, unlike those in many common RL applications. The researchers found that the “Double-Q” reinforcement learning algorithm they developed outperformed other strategies in determining optimal credit limit adjustments. An additional finding was that ‘alternative data’ collected from the super app did not add value above typical financial data in developing a secondary ML model used to predict card balances post credit limit adjustment. Though highly technical, the methodology and results are quite interesting and also a great example of the value of what one might call ‘non-generative AI’ in lending risk decisions.
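For readers who haven’t run into Double Q-learning before, here is a minimal tabular sketch in Python. To be clear, this is the generic textbook form of the update and not the authors’ implementation: the states, actions, rewards, and hyperparameters below are made up for illustration, and the toy setup ignores the delayed-reward handling the paper addresses. The core idea is that two value tables are kept, and on each step one table picks the best next action while the other scores it, which tempers the overestimation that plain Q-learning is prone to.

```python
import random

ACTIONS = ("hold", "increase")  # hypothetical action set for a limit policy


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9, rng=random):
    """One tabular Double Q-learning step.

    The table being updated selects the best next action; the other table
    evaluates it. Missing entries are treated as 0.0.
    """
    # Randomly choose which table to update on this step.
    if rng.random() < 0.5:
        select, evaluate = q_a, q_b
    else:
        select, evaluate = q_b, q_a
    best_next = max(ACTIONS, key=lambda a: select.get((next_state, a), 0.0))
    target = reward + gamma * evaluate.get((next_state, best_next), 0.0)
    old = select.get((state, action), 0.0)
    select[(state, action)] = old + alpha * (target - old)


# Toy experience tuples (state, action, reward, next_state). The rewards are
# invented: raising the limit pays off for low-risk accounts and loses money
# for high-risk ones.
experience = [
    ("low_risk", "increase", 1.0, "low_risk"),
    ("low_risk", "hold", 0.2, "low_risk"),
    ("high_risk", "increase", -1.0, "high_risk"),
    ("high_risk", "hold", 0.1, "high_risk"),
]

q_a, q_b = {}, {}
rng = random.Random(0)  # seeded so the run is repeatable
for _ in range(5000):
    s, a, r, s2 = rng.choice(experience)
    double_q_update(q_a, q_b, s, a, r, s2, rng=rng)


def greedy_action(state):
    # Act on the combined estimate from both tables.
    return max(ACTIONS,
               key=lambda a: q_a.get((state, a), 0.0) + q_b.get((state, a), 0.0))


policy = {s: greedy_action(s) for s in ("low_risk", "high_risk")}
print(policy)
```

In this toy world the learned policy should raise limits for the low-risk segment and hold them for the high-risk one. A real application would need a far richer state representation and a careful treatment of when the consequences of a limit change actually show up in the data.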

Gretel's Tabular LLM

Gretel, a developer-focused synthetic data generation platform, released a “Tabular LLM” in early preview. The tool allows users to generate, augment, and edit tabular data with natural language prompts. While I haven’t personally experienced this product, this is the sort of format-specific model that would have many intuitive and valuable use cases in financial services. To name just a few fairly common problems potentially made easier: generating realistic synthetic data that has the same information value as a real-world dataset but preserves privacy; filling in missing values in a sparse dataset; generating a dataset of realistic customer profiles with diverse financial profiles across thousands of scenarios for the purposes of testing an application or simulating performance; constructing ‘stressed’ portfolio scenarios while incorporating random noise in order to stress test models and risk strategies. Of course, the risk here is that the generated datasets would be overly dependent on the model’s training data and therefore not truly useful for a seriously specific analytical use case. If so, the tool would be great for product testing and user demo creation but fall short of its true potential. Nevertheless, this is a fantastic idea with great promise, and I’d be excited to give it a try.

A year after ChatGPT’s launch, how do banks stack up? - Banking Dive

The launch of ChatGPT and its subsequent iterations have catalyzed a massive wave of enthusiasm across multiple sectors of business, including financial services. It’s hard to find a single large bank that hasn’t dedicated substantial talent, budget, and mindshare to exploring and developing uses for AI tools, whether to streamline internal processes or perform client-facing interactions. This piece in Banking Dive provides an update on the efforts of several large institutions, including Goldman Sachs and JPMorgan, where enthusiasm has apparently not waned. JPMorgan, for example, has been demonstrating its generative AI projects to regulators, seeking their involvement and feedback, particularly around the controls they would put in place. This is as expected, given generative AI’s potential to upend the current paradigm of model risk management generally used in the industry and by supervisory bodies. While Goldman Sachs claims not to be using generative AI in any client-facing projects, other banks are specifically targeting client interaction as an application of the technology. Morgan Stanley, for instance, is apparently testing an AI bot meant to provide assistance to financial advisors. It will be interesting to see how these banks continue to work both internally and with regulators and to observe how soon such applications will be deployed in production. As JPMorgan’s Chief Information Officer remarked: “It’s not a tomorrow thing”.

Companies struggle to deploy AI due to high costs and confusion - Axios

It’s been amazing to see so many companies large and small dive into experimentation with AI technologies, perhaps in an effort to avoid the often too-late adoption of previous technologies, such as mobile computing and the cloud. This is particularly true in financial services, where even large institutions generally not seen as early adopters have announced extensive efforts to incorporate AI into their products and businesses. This piece from Axios reveals that despite the potential, many such companies are struggling with the real-world deployment of AI. The reasons for this include cost, data infrastructure, and talent. AI models can be incredibly expensive to train, tune, and run, particularly when a use case is overly broad. Unlike general open source software or public cloud technologies, even the experimentation is expensive. AI also requires the right sort of data organization, a capable and flexible technical architecture, and the talent to build and deploy what are often very different types of computing workloads. Interestingly, Deloitte and Nvidia have a partnership to provide consulting services for just these problems. Now that companies are shifting from the ‘this is so exciting’ to the ‘this is harder than we thought’ phase, it will be interesting to see which efforts cross the chasm into production and which end up abandoned by companies that find it too difficult to carry out the necessary technical or cultural transformation.

AI and automation in lending: What are the best lenders doing? - Finovate Interview

Back in September, while I was at the Finovate conference in NYC, I spoke to research analyst David Penn about how the best lenders are using AI today. We touched on how some of the top lenders in business and consumer lending are using Ocrolus (where I work), explained how lenders are using cash flow analytics to manage risk, and tackled some of the most common misconceptions around the use of AI in financial services. It was an enjoyable conversation, and I hope you enjoy watching it!

Can an AI be your analytics intern? - FintechAIReview

In case you missed it, I recently recorded a technical deep-dive on LLM-based data analysis tools, including ChatGPT Advanced Data Analysis and LIDA, an open-source project from Microsoft Research. One of the most useful ways to think about AI is that it gives you a bunch of interns. Just like an intern, it’s not perfect, it doesn’t know everything, and you probably have to check the work, but it might help give you more leverage and amplify your work. In the video, I use publicly available securitized auto loan data to explore the capabilities of AI tools for analysis. You can see how the tools plan analysis, write code, generate graphs, and explain their conclusions. I even show some of the code underneath LIDA to reveal how such a tool ‘prompt-engineers’ the LLM to produce high-quality results. Interestingly, interacting with an LLM as an assistant actually might teach us how to better specify tasks for human assistants as well!
