Enhancing Customer Support with Fine-tuned Falcon LLM

This post details how we fine-tuned the Falcon LLM with QLoRA to make customer support more efficient. The objective was a context-aware solution that could be trained on a single GPU. The process involved selecting Falcon LLM for its adaptability, refining a dataset of over 100,000 query-response pairs, and using QLoRA to fit fine-tuning within the memory of one GPU. We addressed challenges such as GPU constraints and context relevance, and the result was a drop in average response time from 6 to 2.6 seconds, a rise in accuracy from 85% to 96%, and a rise in user satisfaction from 75% to 92%.

Azumo Research
March 21, 2024

In our pursuit to remain at the forefront of AI-driven solutions, we embarked on an ambitious internal project. Recognizing the potential of AI in enhancing customer support, we sought to develop a solution that delivered both efficiency and precision.


Our objective: adapt the Falcon LLM to deliver context-aware and efficient customer support responses, leveraging the power of QLoRA and the efficiency of a single GPU.

  • Choosing the Base - Falcon LLM: We decided upon the Falcon LLM due to its adaptability and capacity to handle diverse queries.
  • Tailoring the Dataset: We extracted and refined over 100,000 query-response pairs from past support interactions, ensuring the data's quality and relevance.
  • QLoRA - The Game Changer: To fit within the single-GPU constraint, we employed QLoRA, a technique for fine-tuning large models with a much smaller memory footprint. QLoRA keeps the base model frozen in quantized 4-bit form and trains only small low-rank adapter matrices, so the model retains its capabilities while far fewer parameters are updated.
  • Optimized Training: Using the Hugging Face Transformers library, we tuned the training parameters through extensive testing. The 8-bit paged Adam optimizer was pivotal in managing the memory demands of the model's large size.
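
The steps above can be sketched with Hugging Face Transformers, PEFT, and bitsandbytes. This is a minimal illustration, not the configuration from this project: the checkpoint name, the LoRA rank and target module, the batch size, the learning rate, and the output directory are all assumptions chosen for the example. The heavy imports sit inside the function so the settings can be inspected on a machine without a GPU stack.

```python
"""Sketch: QLoRA fine-tuning setup for a Falcon model on a single GPU."""

# Every value below is an illustrative assumption, not a setting from the post.
MODEL_ID = "tiiuae/falcon-7b"  # assumed checkpoint

LORA_SETTINGS = {
    "r": 16,                                # rank of the low-rank adapters
    "lora_alpha": 32,                       # scaling applied to adapter output
    "lora_dropout": 0.05,
    "target_modules": ["query_key_value"],  # Falcon's fused attention projection
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

TRAINING_SETTINGS = {
    "output_dir": "falcon-support-qlora",   # hypothetical directory name
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "num_train_epochs": 1,
    "logging_steps": 10,
    "bf16": True,                           # assumes a GPU with bfloat16 support
    "optim": "paged_adamw_8bit",            # 8-bit paged optimizer
}


def build_qlora_model_and_args():
    """Load the base model in 4-bit, attach LoRA adapters, build training args."""
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
        TrainingArguments,
    )

    # 4-bit NF4 quantization of the frozen base weights.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, quantization_config=bnb_config, device_map="auto"
    )

    # Prepare the quantized model for training, then add trainable adapters.
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, LoraConfig(**LORA_SETTINGS))

    args = TrainingArguments(**TRAINING_SETTINGS)
    return model, tokenizer, args
```

Calling `build_qlora_model_and_args()` downloads the checkpoint and needs a CUDA GPU with `bitsandbytes` installed. The returned objects could then be passed to a `Trainer` together with a tokenized dataset of query-response pairs.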

Challenges Tackled

  • Navigating GPU Constraints: We addressed the memory limitations of a single GPU with QLoRA's quantization, which shrinks the model's memory footprint without sacrificing its effectiveness.
  • Context Relevance: Continuous validation checks ensured the model's relevance to unique customer interactions.
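
The effect of quantization on GPU memory can be made concrete with back-of-the-envelope arithmetic. The sketch below estimates storage for the weights only, assuming a 7-billion-parameter model (the parameter count is an assumption for illustration). It ignores activations, optimizer state, adapter weights, and quantization metadata, so real usage is higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory for the model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


ASSUMED_PARAMS = 7e9  # assumption: a 7-billion-parameter model

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label} weights: {weight_memory_gb(ASSUMED_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take roughly 14 GB at 16-bit precision and 3.5 GB at 4-bit, which is the difference that makes a single GPU workable.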

Performance Highlights: Before / After

Response Time:
  • Before: Average of 6 seconds
  • After: Reduced to 2.6 seconds

Accuracy Rate:
  • Before: 85% accuracy in automated responses
  • After: A leap to 96% accuracy

User Satisfaction:
  • Before: 75% satisfaction rate
  • After: 92% satisfaction rate
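
To put the figures above on a common footing, the short sketch below computes the absolute and relative change for each metric. It uses only the before/after numbers reported in this post; the function and variable names are illustrative.

```python
# Before/after figures as reported in this post.
METRICS = {
    "response time (s)": (6.0, 2.6),
    "accuracy (%)": (85.0, 96.0),
    "user satisfaction (%)": (75.0, 92.0),
}


def relative_change_pct(before: float, after: float) -> float:
    """Signed change relative to the starting value, in percent."""
    return (after - before) / before * 100.0


for name, (before, after) in METRICS.items():
    print(
        f"{name}: {before:g} -> {after:g} "
        f"(absolute {after - before:+.1f}, "
        f"relative {relative_change_pct(before, after):+.1f}%)"
    )
```

Read this way, average response time fell by about 57%, while accuracy and satisfaction rose by 11 and 17 percentage points respectively.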

QLoRA Training of Falcon-7B with GSM8K Dataset

Training Report


This initiative not only showcased our capability to innovate but also underscored our dedication to enhancing every aspect of our operations. By integrating the Falcon LLM optimized with QLoRA, we reimagined what AI-driven customer support could achieve, all on a single GPU.

For reference, you can access the trained model from here:
