Enhancing Customer Support with Fine-tuned Falcon LLM

In a recent proof of concept, we fine-tuned the Falcon Large Language Model (LLM) with Quantized Low-Rank Adaptation (QLoRA) to improve customer support responses. Working within the constraints of a single GPU, we curated and refined over 100,000 query-response pairs and ran the training with Hugging Face Transformers. Average response time fell from 6 seconds to 2.6 seconds, accuracy rose from 85% to 96%, and we project higher user satisfaction as a result. These findings show what LLM fine-tuning can offer customer support applications even when hardware resources are limited.

Azumo Research
March 21, 2024

As part of our ongoing efforts to advance software solutions using AI, we recently undertook a proof of concept to fine-tune a Large Language Model (LLM) for customer support responses. Through our experience providing LLM fine-tuning services to our customers, we’ve come to appreciate the significant benefits that fine-tuning offers. Using the Falcon LLM and Quantized Low-Rank Adaptation (QLoRA), we set out to measure how much response accuracy and efficiency could improve within the limits of a single GPU. This project was an exploratory exercise in LLM fine-tuning, not an attempt to deploy a production-ready system.

Our Approach

Selecting the Falcon LLM

We chose the Falcon LLM for its flexibility and strong performance in handling a variety of queries. Its architecture provided a suitable platform for experimentation, allowing us to investigate its capabilities in generating enhanced customer support responses.

Data Curation

Recognizing the importance of high-quality data, we compiled and refined over 100,000 query-response pairs from previous customer interactions. This carefully curated dataset ensured relevance and accuracy, serving as a solid foundation for the fine-tuning process.
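As a rough illustration of the kind of curation described above, the sketch below normalizes whitespace, drops pairs that are too short to be useful, and removes duplicate queries. The field names (`query`, `response`), the length threshold, and the helper name `clean_pairs` are assumptions made for illustration; the actual pipeline is not described in this post.

```python
import re
from typing import Iterable


def clean_pairs(pairs: Iterable[dict], min_chars: int = 5) -> list[dict]:
    """Normalize, filter, and de-duplicate query-response pairs (hypothetical helper)."""
    seen = set()
    cleaned = []
    for pair in pairs:
        # Collapse runs of whitespace and trim both fields.
        query = re.sub(r"\s+", " ", pair.get("query", "")).strip()
        response = re.sub(r"\s+", " ", pair.get("response", "")).strip()
        # Skip pairs where either side is too short to be useful.
        if len(query) < min_chars or len(response) < min_chars:
            continue
        # De-duplicate on the case-insensitive query text.
        key = query.lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"query": query, "response": response})
    return cleaned


# Made-up sample records, for illustration only.
raw = [
    {"query": "How do I reset my password?", "response": "Use the 'Forgot password' link."},
    {"query": "how do I  reset my password?", "response": "Click 'Forgot password'."},
    {"query": "Hi", "response": "Hello!"},
]
print(clean_pairs(raw))
```

Of the three sample records, only the first survives: the second is a duplicate once case and whitespace are normalized, and the third is too short.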

Implementing QLoRA

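A key aspect of our proof of concept was the application of QLoRA. This technique makes it practical to fine-tune a large language model on a single GPU by combining two ideas: the pretrained weights are quantized to low precision and kept frozen, and only a small set of low-rank adapter matrices is trained on top of them. QLoRA therefore preserves what the base model has already learned while sharply reducing the number of trainable parameters and the memory needed to hold the model, which made the process feasible despite our hardware constraints.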
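To make the parameter reduction concrete, here is a small arithmetic sketch. A low-rank adapter leaves a weight matrix of shape d × k frozen and trains two factors of shapes d × r and r × k instead. The dimensions and rank below are illustrative assumptions, not the values used in this project.

```python
def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one d x k weight matrix."""
    full = d * k          # parameters updated by full fine-tuning
    lora = r * (d + k)    # parameters in the two low-rank factors
    return full, lora


# Hypothetical projection size and adapter rank, chosen only for illustration.
full, lora = lora_param_counts(d=4096, k=4096, r=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
# prints: full: 16,777,216  lora: 131,072  ratio: 0.78%
```

Under these assumed sizes, the adapter trains less than 1% of the parameters that full fine-tuning of the same matrix would update.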

Optimizing Training with Hugging Face Transformers

We used the Hugging Face Transformers library for the fine-tuning process. After extensive testing and hyperparameter adjustments, we settled on a training configuration that fit our hardware. The 8-bit paged Adam optimizer was instrumental here: it keeps optimizer state in 8 bits and can page that state out to CPU memory during spikes, which kept GPU memory use within our limits.
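A configuration along these lines might look like the sketch below. It assumes the `transformers`, `peft`, `bitsandbytes`, and `torch` packages are installed. Every hyperparameter value, the output directory name, and the choice of 4-bit NF4 quantization are illustrative assumptions rather than the settings used in this project. `paged_adamw_8bit` is the paged 8-bit optimizer option we know Transformers to expose; the project's exact optimizer variant is not specified here.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# Quantize the frozen base weights to 4-bit NF4 (assumed settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters on Falcon's fused attention projection (assumed rank and scaling).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Training settings with a paged 8-bit optimizer (all values hypothetical).
training_args = TrainingArguments(
    output_dir="falcon-support-qlora",  # hypothetical directory name
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=50,
    optim="paged_adamw_8bit",
)
```

In a full script, `bnb_config` would be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` would be applied with `peft.get_peft_model` before handing the model and `training_args` to a `Trainer`.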

Challenges and Insights

Hardware Limitations

Operating within the memory constraints of a single GPU posed a significant challenge. QLoRA’s quantization reduced the memory footprint of the model weights enough to train on one device without compromising the quality of the responses. The exercise showed that hardware limits of this kind can be addressed with technique rather than with additional hardware.
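A back-of-the-envelope estimate shows why quantization matters on a single GPU. The sketch assumes a model of roughly 7 billion parameters (suggested by the `falcon-7b` name of the published model) and counts weight storage only, ignoring activations, adapter weights, optimizer state, and quantization overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 7e9  # assumed parameter count, for illustration only

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take about 14 GB at 16 bits, 7 GB at 8 bits, and 3.5 GB at 4 bits.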

Maintaining Contextual Accuracy

Ensuring the model’s responses remained contextually accurate was critical. We implemented continuous validation throughout the training to maintain the relevance and usefulness of the generated customer support responses.
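One simple way to validate continuously during training is to score a held-out set at regular intervals and check whether the validation loss is still improving. The class below is a minimal, framework-independent sketch of that idea. The class name, the patience value, and the loss values in the usage example are hypothetical and do not come from this project.

```python
class ValidationTracker:
    """Track held-out loss across evaluations and signal when it stops improving."""

    def __init__(self, patience: int = 3) -> None:
        self.patience = patience
        self.best_loss = float("inf")
        self.evals_without_improvement = 0

    def update(self, val_loss: float) -> bool:
        """Record a new validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.evals_without_improvement = 0
        else:
            self.evals_without_improvement += 1
        return self.evals_without_improvement >= self.patience


# Made-up validation losses, one per evaluation round.
tracker = ValidationTracker(patience=2)
for step, loss in enumerate([1.90, 1.42, 1.35, 1.37, 1.36], start=1):
    if tracker.update(loss):
        print(f"stopping after evaluation {step}; best loss {tracker.best_loss}")
        break
```

With a patience of 2, the loop stops at the fifth evaluation because the loss has failed to beat 1.35 twice in a row.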

Results

Although this was a proof of concept and not intended for production deployment, the improvements observed were noteworthy:

Response Time: Decreased from an average of 6 seconds to 2.6 seconds.

Accuracy: Improved from 85% to 96% in generating appropriate responses.

Projected User Satisfaction: Expected to rise substantially as a result of faster and more accurate responses.
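The relative size of these improvements can be worked out directly from the reported figures. The short calculation below uses only the numbers above; treating the error rate as 100% minus accuracy is our reading of the accuracy figure, stated here as an assumption.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


latency_cut = relative_reduction(6.0, 2.6)            # response time in seconds
error_cut = relative_reduction(100 - 85, 100 - 96)    # assumed error rate in percent

print(f"response time reduced by {latency_cut:.1%}")
print(f"speedup: {6.0 / 2.6:.2f}x")
print(f"error rate reduced by {error_cut:.1%} (15% -> 4%)")
```

That is a response-time reduction of about 56.7% (roughly a 2.31x speedup) and, under the stated assumption, a drop of about 73.3% in inappropriate responses.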

Training Report

This proof of concept has provided valuable insights into the potential of fine-tuning LLMs for customer support applications. By successfully adapting the Falcon LLM using QLoRA techniques on a single GPU, we demonstrated significant improvements in response time and accuracy. These findings will inform our future endeavors in AI and contribute to our goal of enhancing software solutions through innovative approaches.

For those interested in our work, the trained model from this project is available here: https://huggingface.co/azumo/falcon-7b-gsm8k
