Episode 28

The Software Developer's AI Cheat Sheet Revealed Ep. 027

In this episode of The Programming Podcast, Danny Thompson and Leon Noel are joined by special guest Santiago, a machine learning engineer and teacher of the Machine Learning School cohort, for one of the most actionable and grounded AI discussions in tech today.


We cut through the hype and dive deep into:

- How AI is (and isn’t) replacing developers

- Using AI as a co-pilot vs letting AI drive

- Vector databases, embeddings, and Retrieval-Augmented Generation (RAG)

- Model Context Protocol (MCP), agent-based systems, and why no one agrees on what an agent is

- The underhyped power of edge ML and TinyML

- Real-world agent use cases (like automating DocuSign workflows)

- Getting started with AI as a developer today — beyond building chatbots


Follow our Guest Co-host:

https://x.com/svpino

https://www.ml.school/

https://www.youtube.com/@UCgLxmJ8xER7Y7sywMN5SfWg


This episode is loaded with insights for junior devs, industry veterans, and anyone curious about building real AI-enhanced applications.


🔔 Don’t forget to like, comment, and subscribe for more developer-first conversations!


⏱️ Timestamps & Chapters:

0:00 – Will AI Replace Software Developers?

0:45 – Embracing AI as a Developer Tool

2:00 – Meet the Hosts and Guest (Danny, Leon, Santiago)

3:20 – Leon’s AI Skepticism Journey

4:30 – Why “AI Won’t Replace You, But Someone Using It Might”

6:30 – Becoming a Solo Builder at Scale with AI

8:15 – Real Examples of Using AI to Get More Done

9:45 – AI and the Power of Prototyping

11:00 – AI Co-pilot vs. You Co-piloting AI

13:00 – Future of Tools Built for AI (MCP, A2A)

14:40 – Shifting Between Developer Modes with AI

17:00 – Intro to Vector Databases & Embeddings

20:45 – What is Retrieval-Augmented Generation (RAG)?

23:00 – Large Context Windows vs RAG: Is RAG Still Relevant?

25:30 – Indexing Speed & Clustering Explained with Spice Rack Analogy

27:50 – Why Grounding Matters (Reducing Hallucinations in LLMs)

31:00 – The Importance of the Ingestion Layer

32:10 – Other Crucial AI Trends: MCP, Agent-to-Agent

33:00 – Defining AI Agents (Why No One Agrees)

35:00 – The Browser Wars of Agent Definitions

36:30 – Overhyped vs Underrepresented AI Concepts

38:00 – Specialized Tools for AI Builders

39:45 – Agents Will Be the Next Microservices

41:00 – Real-World Example: AI + DocuSign + Slack + Asana Integration

43:00 – Underhyped: TinyML and AI on Edge Devices

45:00 – Apple’s Adapter Strategy with LoRA Models

47:00 – Companies Aren’t Ready to Train Custom Models Yet

48:30 – Santiago’s Current AI Stack: Cursor, Windsurf, Tracer, and More

50:30 – Specialized Tools for GitHub Issues, Jupyter Notebooks

52:20 – Fine-Tuning Tools & Lightweight Model Training (DPO, RLHF)

54:00 – Ask Danny, Leon & Santiago: How to Get Started in AI

55:00 – First Project Ideas: RAG with YouTube, Building MCP Servers

58:00 – Easy Agent Idea: Email Filter for Promotions

59:00 – Leon’s Learning Stack: FastAI, MCP Docs, Replicate

1:00:00 – Danny’s Favorite Beginner Project: Alt Text Generator

1:00:25 – Wrap-Up and Goodbye


Transcript

So people claim that AI will completely replace software developers. I highly doubt it. But also, I'm now on the side of, well, we really need to learn how to use this to our advantage.

Say, I want to build this, build me this prototype. Oh, I have 200 lines of code. I'm not going to just deploy those lines of code.

I just have everything that I needed that before used to cost me, I don't know, maybe two or three hours of research. I believe you can build reliable agents to do a lot of useful stuff. With the tech that we have today, we have to keep working on all of the tech that's around the LLM, right?

The guardrails, the security, stuff that we've been building for years. We have to bring it into this new field to augment that LLM with all of that so we can actually build reliable agents. Some of these big personalities who've said, yeah, we're going to be replacing developers.

We're going to be, you know, when you dig a little bit deeper, it turns out there is a lot of color in that commentary. There is a lot of nuance in that commentary. What's going on, everyone?

We are back with another one. Welcome to another episode of the Programming Podcast. I'm really geeked out for this one today in particular. The overall idea with this one is that this is going to be your AI cheat sheet for everything that's happening right now, and I was very, very selective on who I tapped into and talked to and got for this episode.

And we had scheduling conflicts and I was like, I'm not going to do this with anybody else for us. I need to bring this person on in particular. I've had the luxury to know him now for like four years, five years, something like that.

We've been talking pretty much always on social media. Yeah. And so let's run into the intros.

I'm one of your co-hosts. My name is Danny Thompson. I'm the director of technology at a company called This Dot Labs.

And I'm one of your co-hosts, Leon Noel, manager of engineering at Resilient Coders and community member at 100 Devs. And I'm the new co-host. My name is Santiago, and I am a machine learning engineer and the teacher of the Machine Learning School cohort.

Before we jump into this, editor, I'm going to need you to cut to a previous episode, a particular clip that I'm going to send you, of my good buddy Leon Noel being a doubter of the AI, saying, quote, "What am I missing?" Cut. Grab. All right, all right, all right, come on now. All right. So I was reading your blog posts and I felt like you were speaking directly to me.

I was the person as of October last year that was smug, that would always advocate folks knowing the tools and software they use. Deep in the core of my being, I was instructing my students in that way.

I was the person who played with it for 10 minutes and didn't get it and therefore kind of abandoned it in the beginning.

And so I felt like you were speaking directly to me with your blog post and a couple of your recent videos. But I think that had a lot of purpose. And I think you see something that not everyone sees yet.

And that's why I'm super excited to have you on the podcast today. You talked about the idea of like AI won't take your job, but a person using AI will. What are you seeing play out?

What do we need to be prepared for?

And then I know we can talk about a lot of other specific things too. I would say the most important thing to keep in mind is that I wrote that blog post thinking about myself. So whatever you were in October last year, I was there as well.

So I came to this field with many years of professional experience and something new comes up, which is this AI for coding thing. And it's me looking at something and say, there's no way that thing will do the same thing as I do. Okay.

So that's me three years ago, right?

But I'm going to have to stand corrected. And bit by bit, that thing started taking more off my plate. Okay?

To a point where it can actually replace me?

I don't think so. At least not to a full extent. But what started three years ago, just being useful for certain tasks, now is useful for way more complex tasks.

And it will continue to do so. To what extent?

Nobody's sure. Nobody knows. So people claim that AI will completely replace software developers.

I highly doubt it. But also, I'm now on the side of, well, we really need to learn how to use these to our advantage. Like if you're a professional developer, there are two modes that you can be in.

Mode number one is, oh, I don't believe in that. And everyone using it is doomed to fail. And this is a fad and it's going to go away next year.

And I'm going to tell you, ah, good luck with that. Mode number two is where you say, I'm a professional developer. I have a very good foundation to use that tool to completely increase my output here in many, many different ways.

So that would be my main recommendation for people. This thing is going to continue getting better. I would really, really, really need to embrace it.

That would be it. You know, not for nothing, though, I think that take that you had in the beginning is fair. And I think everyone is of like equal mindset.

I definitely was. You know, I didn't know where we were going with it. I didn't know what was possible.

I could see some of the possibilities, but I didn't see anything that I thought was super, super overly impressive. And the people that I saw going gung-ho about it, I'm like, man, you're hyping up something that's just not there. I no longer feel that take is accurate anymore.

As far as replacing developers, no, I'm not even there. But it's funny because my mind has gone into a very different place where I've seen something that I believe is possible that I thought was only possible for a few before this. And that's the ability to be a builder at scale without having the budget of scale and the team of scale and i feel like now more than ever before people will have the opportunity if you can't get a job in tech for whatever reason that's not your only main path of growth and financial security within the field of tech anymore like of course there was always the ability to be like a startup founder or like to do like freelancing or something like that.

This is like on a totally different scale where you could freelance and act as if you have a team by yourself where you're literally by yourself. Or if you wanted to like build a solution, you have the power of multiple engineers doing tasks. And when I think about AI productivity, I feel like I'm in like a different conversation with people at times because some people are like, well, I don't want AI writing all my code.

I was like, cool. But like when you're on a team, you have somebody else working with you that's writing code, right?

And I think a lot of people when they're thinking about AI doing something, they're like, I'm just giving it a command. I'm not auditing it. I'm not building it.

I'm not doing anything. I got in this conversation very recently; that's why it's top of mind for me. But for me, I'm often doing something in tandem. So maybe I'm giving AI, hey, go update all the labels in my code base. I don't want to do that, what's the better word, stuff. I don't want to do that stuff. I want to do something where it's like, let me build this logic, let me build this solution, let me type into this whole algorithm that I'm trying to build out. I'd rather do that. Or let me work on the back end. I hate making buttons. I've already got a style library for the buttons. Hey, AI, go read my style library, now build a button for this. And then I can focus on this other thing. I'm getting twice the amount of work done, but I can focus on the thing that I actually want to focus on. Let it do its thing for 15 minutes, look back at the chat, give it another prompt, and then go back to my thing for 15 minutes.

Like it lets me get so much more done. But now when I'm starting to think of and look at other solutions, like there's so many things I can tap into that number one, just helps me exponentially within the organization. But number two, it speeds me up as a developer in general.

And so I want to jump into a bunch of that, but I love your thoughts around what I just said. Hey, listen, one of the things as a developer, I write a lot of code. And as a developer, one of the things that it's really hard for me is to break inertia.

So I have an idea, I have something to build. That initial phase where I have to do the research, what library should I use?

How do I start using this library?

That initial code prototype, it takes me a long time to have it done. After all of that is settled, after I know what the documentation is and I build the first hello world, now things start moving, okay?

So that's always been the case for me. Guess what I'm doing right now?

That initial momentum, I get it with AI. Say, I want to build this thing, just build me this prototype. Boom, I have 200 lines of code.

I'm not going to just deploy those lines of code. I just have everything that I needed that before used to cost me, I don't know, maybe two, three hours of research, right?

So now I can start building right away off of a prototype that AI built for me. And I can start just making modifications, changing this function or changing that, or trying to understand why I did that. That alone, if you give me that as a feature and ask me to pay for it, how much is it going to be worth to me when I can save three hours every time that I want to build something?

Does that make sense?

Just if you use it just for that, which is just a small thing that AI can do for me. But just for that is extremely worth it for me. Yeah.

I'd like the way that, one, that's amazing, right?

So many times I'm developing something, the first step is like the initial prototype. How many initial prototypes do I do to explain concepts, figure out what I'm going to do?


I think I really enjoy how you've broken down this kind of current wave into like three buckets. You said like learning how to use AI as a co-pilot and then learning how to co-pilot for the AI. And then you said building toolings with AI.

And so I think a lot of the hesitation that I'm seeing that was initially in myself and from others that are in the space is they're kind of thinking about that second bucket where you're the co-pilot for the AI. AI is doing everything. AI is doing the entire building.

Whereas kind of what you just mentioned is still in that beginning, the AI is doing the initial heavy lifting, but it's still you using it as a co-pilot to get your ideas out there, to get that momentum. And right now, there's a lot of great tooling for that. There's a lot of ways that you can get that co-pilot piece to kind of augment what you're doing.

And then I think the next big frontier is that sitting back, having it auto-generate a lot more than we're currently maybe comfortable with, and then all that tooling that's happening as well. So I'm kind of curious how you see the current breakdown. And I would love to hear more about your process because even what you just described is really kind of key.

How do you see AI like being a co-pilot for your day-to-day activities?

How do you, when do you step back and let AI do the heavy lifting where you're now the co-pilot?

And then when do you bring in the tools?

Yeah, a hundred percent. So just to set it aside, the tool part, what I'm thinking there is things like MCP or A2A, which was, you know, published by Google. And this is an entire field where we're going to start building things for AI to get better at something, or basically building capabilities for AI, right?

And you can get as deep as thinking, what's going to happen with e-commerce going forward?

E-commerce right now, or advertising in general, is we build or we create ads for people to read and take action on. Fast forward a few years, ads will be for other AI systems. So instead of putting a beautiful face with a person showing off, I don't know, whatever product they want to sell, we're going to have to create a language for other AIs so that AI, which is acting on your behalf, knows that that's a product that you would be interested in.

Does that make sense?

So a whole thing, that tool, that world of tools and communication between AI agents, that's coming, that's happening. And a lot of people are going to make a lot of money building on that realm. And we need software developers using AI to build tools for AI, right?

Now let's get to the other two modes. The way I see them, or at least the way I experience them, is I go from one mode to another mode, back to one mode, okay? So fast forward into the future, I think a big part of what we're missing, that we're going to see getting better, is UX for vibe coders. And I'm calling them vibe coders, I don't like the term, but basically it's for people who are going to be using these tools to build software, not to write code. Like, they don't care about the code.

Right now, code, it's this intermediate weird step, right?

But what they really care about is just building solutions. The code is just going to be something behind the scenes that nobody cares about. Those tools, that is not Cursor.

That's just, you know, for my wife, she's not technical. She wants to build something to do the finances for my house. She does not care about the code.

Cursor is not a great tool for that. So those tools will have to improve for that regard. Now, for people like me, I'm technical.

I do care a lot about the code right now. The way it happens to me is I move from one mode to the other mode. When I'm starting to build a prototype, I'm in the mode of me as a co-pilot.

I'm just asking. I'm doing very little here. I'm just asking the model or the tool, whatever it is, let's say Cursor or Windsurf.

I'm asking the tool, build this for me. My focus is on how I ask for what I want. How many details do I give the tool about what I want so I can get a good result out of the tool?

But that's my focus, not in the code, but in the prompt. After the tool provides that code to me, now I get into the other mode. Now I become the one driving here and just using AI to help me clean the code that I wrote in the first place.

Now I'm checking that everything is working. Now I'm making decisions. Okay, so I like this function.

I want to write a couple of tests for this function. I'm going to use AI just to write those tests. So I'm going to go back and forth from both modes where I'm driving, where I let AI drive, right?

I think that, you know, as time goes on and as these tools improve, as the UX of these tools improves, you're going to see a lot of people spending more time co-piloting AI, but letting AI do more of the stuff. We'll have to see how far we can get in that direction. But that's how I feel for now. Does that make sense? That was a lot. Yeah, it was a mouthful. No, I think that was really good. And it kind of leads into maybe this next territory of the quote-unquote cheat sheet aspects. Let's assume that there's going to be some people listening to this that know a lot about AI.

Let's also assume that there's going to be a portion of people listening to this that have no clue about it. And prime example, like our listenership to, you know, I'm very grateful for it, but it's all over the place. Like we have junior devs writing the first line of codes.

And then we have people with 20 plus years of experience in the industry submitting questions, trying to get help, which is great. I love it. And so it now makes me think of like, how do we cater our answers to everyone?

So that way they find value. So let's keep both groups, or multiple groups rather, in mind. But let's start defining some of this. And I'll let you, obviously as, you know, our new co-host for the day, start us off a little bit, and let's start going down the rabbit hole of things. I was actually going to ask one question separately first, but I feel like you need to ask a different one in order to set it up. And so let's start off with vector databases. Why is this a thing? What does it do? Why do we have to even know this term if we want to get involved with, like, building AI-related items? Right. So, well, the whole idea of AI, if you wanted to go to one of the fundamental concepts that makes all of this possible, is the idea of an embedding.

Okay. So that's the word. We call it embedding.

An embedding is just a vector of numbers that represents a concept, okay, in the world. So imagine that I have a can in my hand, which I do, and I want to represent the concept of a drink, of a soft drink. There is going to be, right, like a Coke, like a Diet Coke, there's going to be a set of numbers that, in this whole universe of ideas, is like a coordinate for that universe, and that's going to represent this can.

What's going to be inside that, or in that position, is going to be the can. That's what we call an embedding, okay?

So if we have an action like me drinking a can, that would be another set of numbers. Or you have a dog or a cat or a table. All of those have these embeddings, okay?

So those embeddings get created by a model. We have an embedding model and we pass information into that embedding model and the embedding model outputs those coordinates. Those are the coordinates that we store in a vector database.

As the name says, a vector database is just a database of vectors, a bunch of numbers. And one characteristic of a vector database is that it allows you to find similar vectors really, really quickly. And why is this important?

Because if I wanted to find, let's say I take now a picture of a different can. See, I have two cans. They're different.

They represent the same concept. They will output similar but different vectors. Okay.

But they're going to be similar. Now, a vector is a set of numbers. And it's not obvious, like for us, like people, it's not obvious how you will tell that this set of numbers is similar to this other set of numbers.

We need just a formula. One that we use is cosine similarity. It's just a formula.

You input two sets of numbers and it tells you how close those two vectors are. Well, a vector database is going to let us do that really, really fast. So that means that if I take the set of numbers of this can and I ask a vector database, give me back anything that's similar to this, the vector database is going to return the embedding for this can here.
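The cosine similarity formula Santiago mentions can be sketched in a few lines. This is a toy illustration, not any particular library's implementation, and the three vectors are made-up stand-ins for embeddings of two soda cans and an unrelated concept:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings" (real ones have hundreds of dimensions)
coke = [0.91, 0.05, 0.88]
diet_coke = [0.89, 0.07, 0.86]
table = [0.02, 0.95, 0.11]

print(cosine_similarity(coke, diet_coke))  # close to 1.0: similar concepts
print(cosine_similarity(coke, table))      # much lower: unrelated concepts
```

The two cans score near 1.0 while the unrelated concept scores much lower; that gap is exactly the signal a vector database exploits when it returns "anything similar to this."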

Why is any of this useful?

Well, now you can do things like retrieval-augmented generation, or RAG, very, very quickly. How does RAG work?

What is the idea of RAG?

You have a model and you want the model to answer a question. You have knowledge to provide to that large language model because then the model doesn't know that information. Let's say you want the model to summarize a YouTube video.

Okay, that's what you want the model to do. You have the entire script of the YouTube video and you want to give it to the model. So the model summarizes that and maybe answers a question about that summary.

So you have the knowledge, which is the YouTube video. The model doesn't know about the YouTube video. You can provide it to the model.

But the model has limited context, meaning you cannot just provide a two-hour YouTube video to that model because it doesn't fit. The model cannot process all of that. So you only have to give the model the amount or the passages of that video, the portions of the video that are relevant to answer the questions that you're asking that model to answer.

Okay. So let's say you're going to ask the model about a math question and there is a portion of the video where we talk about that math. Okay.

You want to provide that portion to the model and say, answer this question using this context. How do you know that's the portion?

How do you, starting with the script, how do you know where to find the conversation about the math?

That's what embeddings and relevant and similarity comes in. Because if you take the whole script of the video, you chunk it out, and then you get different passages, and you generate embeddings for each one of those passages, and you store all of those embeddings in a vector database, when somebody asks a question, you can take that question, generate an embedding for that question, and ask the vector database, give me anything that's similar to this question. So any passages from the video that are similar to this math question are going to come back.

You're going to get those and give them to the model to answer the question. And I know there are a lot of pieces moving here, but I just want to summarize it. Similarity between these vectors of numbers is a huge, huge component of how you can make these big systems work and ground these models.
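The pipeline described above — chunk the script, embed each chunk, store the vectors, retrieve by similarity — can be sketched like this. The `embed()` function is a toy bag-of-words stand-in for a real embedding model, and the list of (chunk, vector) pairs stands in for the vector database:

```python
import math
from collections import Counter

def embed(text, vocab):
    # Toy embedding: count how often each vocabulary word appears.
    # A real embedding model would return dense semantic vectors.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# 1. Chunk the "video transcript" into passages.
chunks = [
    "the derivative of x squared is two x",
    "our sponsor this week makes great coffee",
    "subscribe and leave a comment below",
]
vocab = sorted({w for c in chunks for w in c.lower().split()})

# 2. Embed each chunk and store it (this list is our "vector database").
index = [(c, embed(c, vocab)) for c in chunks]

# 3. Embed the question and retrieve the most similar passage.
question = "what is the derivative of x squared"
q_vec = embed(question, vocab)
best_chunk, _ = max(index, key=lambda pair: cosine(q_vec, pair[1]))

# 4. best_chunk would now be pasted into the LLM prompt as context.
print(best_chunk)
```

Only the math passage comes back for the math question, so the model answers from relevant context instead of the whole two-hour transcript.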

Grounding a model matters because models hallucinate. If you ask questions of these models without providing context, there is a higher chance that they're going to just make something up. That's why it's always a good idea, assuming that you have the knowledge, to just give them that knowledge.

And the vector database is behind all of this. And then long, long answer. But yeah, I know that's a great answer.

How relevant is all this now that the context windows are getting larger?

Right. People are talking about a million. Yeah.

So is this still something that we're going to be using a few years from now?

Does that have an impact?

So number one question is great. Now the context is getting like really, really large. This is less relevant, but still very useful.

So one of the problems that large context brings is that models do not pay attention to all of the context in the same way. What's even more interesting, different models focus on different areas of the context in different ways. So, for example, you might have model A, which is better at understanding and paying attention at the beginning of the context window.

So whatever you put in front, model A is going to be better at. But it's going to tend to ignore things that are in the middle of the context window.

But model B, however, is better at what you say at the end of the context window.

That's how humans work. Like we tend to remember the last thing you said from a conversation. We tend to forget the first thing you said from a conversation.

So different models act differently. The larger the context is, the harder it is for the model to remember it all. So depending on your use case, maybe the context, the entire context that you are working with, you can fit it in the window and it's small enough that your model can just extract all of the information from it.

You don't need a RAG system for that. But if you're working for a company, imagine you're working for Bank of America, they have a humongous knowledge base with articles and FAQs, etc. A larger context window might solve the problem technically, meaning you don't need a RAG system because you can fit it all in two million tokens, whatever the context is.

But the results are not going to be great if you just go the context way, at least not for now. Is this going to be solved?

Maybe. But we don't know that yet. So that's why RAG is still relevant.

So I want to add one thing here. And if you did mention this, I apologize because I don't feel like I heard it. And so I want to add in like a couple points.

I feel like we've touched a lot in a very short period of time, and it's super easy to skip things, especially because we're trying to cover such a breadth. When it comes to vector databases, I agree, by the way, it's slightly less relevant than it was a year ago, but we're talking slightly, to the point where you can't ignore it, in my opinion.

And it's for a couple of big specific reasons. Embedding, I'll give you the example that I gave to a team of like college kids when I was explaining this to them because they were kind of confused by the concept. And so if you, I love the example of the cans, by the way, I thought that was really good.

So I give a text example. So I say, if you were to type in like spicy chicken sandwich flavoring, right?

I'm big on flavor. You can probably tell. But the thing here is it breaks it down verbiage-wise.

So spicy is going to be an item that it's broken down into numeric values, savory, recipe, chicken, et cetera, right?

It's going to break all that verbiage down. Now, one thing that I want you to think of is it now converted that word into numbers and it's a numerical value. And it indexes things for the purpose of speed, right?

This is a big factor here, especially when we're talking millions and millions of lines. So I'll just ask you, just out of curiosity, Leon: when you think about indexing something for speed, what do you think of, like sorting, in that regard? I was asking Leon since he's not AI-inclined, just to get his perspective, and I'm curious if somebody listening has a similar idea. When you index something for speed, what would you sort it as? Would it be, like, alphabetical, or things like that?

I know, I know, I know. So the idea... I do not remember, I read the paper.

So my recommendation would be to look at the FAISS paper, and that is spelled F-A-I-S-S. Okay, so the FAISS paper explains how that information is stored so you can actually retrieve it really, really fast from a vector database. And actually, when you start building for AI, that's the most common algorithm, and that's what you should use.

So it's an entire algorithm. It's not just, hey, let's sort. That's why you need a specialized, not necessarily a specialized database, but specialized features to store these vectors.

By the way, when we started, vector databases were really, really popular. Now, every major database that you know of, Postgres, MySQL, all of them support vector tables. Yeah.

You know, because they incorporated the functionality. So yes, the FAISS paper explains it, or you can just Google it; you don't have to read the paper to learn how the algorithm works.

They will explain how they do the organization. Right. They have clusters and collisions, and they move one thing from one cluster to another.

There are collisions. So it's a fairly complicated but smart way of organizing the data. I was leading into that, man.

You took my thunder a little bit. No, clustering is the traditional way that I have utilized it, seen it, and experienced it with it. And so the example that I give in relation to that spicy chicken sandwich item is if you were now sorting your spice rack in order to make this sandwich, you'd have like warm spices together, like cinnamon, nutmeg, or hot chilies, like chili, cayenne, what's it called, gochujang from like Korean peppers and things like that.

It would cluster them together because now if you were to traditionally search through like millions of lines of code or like lines of information it would take such a long period of time but by that clustering of similarities and by determining the translation of that text of like spicy it can now put it in a similar context it now takes fractions of milliseconds in order to get that response back. So you're getting such elaborate replies. And when you tie that now into RAG, for example, RAG is the retrieval aspect.
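The spice-rack clustering idea is, roughly, the inverted-file approach the FAISS paper describes: vectors are grouped under centroids, and a search scans only the nearest group instead of every vector. This is a simplified sketch; the centroids here are hand-picked for illustration, whereas a real index learns them with k-means:

```python
import math

def dist(a, b):
    # Euclidean distance between two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two "shelves" of the spice rack (cluster centroids)
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(0.1, 0.2), (0.3, 0.1), (9.8, 10.1), (10.2, 9.9)]

# Ingestion: file each vector under its nearest centroid.
buckets = {i: [] for i in range(len(centroids))}
for v in vectors:
    i = min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
    buckets[i].append(v)

def search(query, nprobe=1):
    # Rank shelves by distance to the query, then scan only the
    # top `nprobe` shelves instead of every stored vector.
    order = sorted(range(len(centroids)),
                   key=lambda c: dist(query, centroids[c]))
    candidates = [v for c in order[:nprobe] for v in buckets[c]]
    return min(candidates, key=lambda v: dist(query, v))

print(search((9.9, 9.9)))  # nearest neighbor found by scanning one shelf
```

With millions of vectors and thousands of clusters, skipping all but a few shelves is what turns a full scan into a lookup that takes fractions of a millisecond.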

And Santiago covered a lot of that, so I'm not going to even touch on that. But the one aspect I will touch on is the hopeful reduction of hallucinations. So when you're retrieving these documents, right, part of it is retrieving the query, the knowledge and information by itself.

The other part is now that it can augment this context and this response because it no longer has to like pull it first, read it, et cetera, because it already has that context. And so now the LLM by itself is generating its answer based on that having the accuracy. Like it's already defaulting.

It's in here. We know it's accurate. We're no longer worried that it's a hallucination.

So it's going off of it by default in order to build your responses. So especially at, like, a corporate or enterprise level, where you have millions and millions of responses, the cut-down in time alone is massive. And that's a big factor for it at, like, a Fortune 500 or something.
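The augmentation step being described might be sketched like this: the retrieved passages get pasted into the prompt so the LLM answers from known-good context rather than its own memory. The passages and wording below are made up for illustration, and the actual model call is left out:

```python
def build_grounded_prompt(question: str, retrieved: list[str]) -> str:
    """Assemble a prompt that grounds the model in retrieved passages."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(retrieved))
    return (
        "Answer using ONLY the sources below. "
        "If the answer is not in them, say you don't know. "
        "Cite sources by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# `retrieved` would come back from the vector database lookup.
retrieved = [
    "Refund requests must be filed within 30 days of purchase.",
    "Enterprise customers have a dedicated support channel.",
]
prompt = build_grounded_prompt("What is the refund window?", retrieved)
print(prompt)
```

Instructing the model to cite the numbered sources — and to admit when they don't contain the answer — is the "grounding" that cuts down hallucinations.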

Yeah, no, I mean, 100%, companies are building RAG applications left and right. Oh, yeah. I think right now the tech is very good.

Where I've seen the most focus is on preprocessing, or the ingestion phase, which is basically: you have the documents. Everyone skips this when they're talking about RAG. They say, well, you have the documents and you generate embeddings, and they move on really quick.

What would I say?

But those documents are not that simple. Those documents have tables and images and diagrams that you have to somehow understand — like a flow diagram, how you go through all of these steps. So a lot of companies are focusing right now on how to ingest that information and process it to generate good embeddings.

So those embeddings can then serve the RAG application well. Because if your vector database contains crap, obviously your results are going to be crap. So that ingestion phase is where a lot of the focus is right now.
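The ingestion phase might be sketched like this: split each raw document into overlapping chunks so every embedding covers a coherent span. The chunk sizes are arbitrary, and the embedding call is only indicated by a comment — real pipelines also have to parse the tables, images, and diagrams Santiago mentions:

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows so no idea is cut clean in half."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def ingest(documents: list[str]) -> list[tuple[str, str]]:
    """Turn raw documents into (doc_id, chunk) records ready for embedding."""
    records = []
    for doc_id, doc in enumerate(documents):
        for piece in chunk(doc):
            # vector = embed(piece) would go here, then
            # (doc_id, piece, vector) gets written to the vector database.
            records.append((f"doc-{doc_id}", piece))
    return records

records = ingest(["a" * 450])
print(len(records))  # 3 overlapping chunks from one 450-char document
```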

We talked a lot about vector databases, of course. We talked about RAG. Are there any other big things right now that our listeners should know about or be aware of?

MCP. Yeah, MCP, of course. Are there any other things that are kind of like top of mind for you right now?

I mean, MCP is obviously one of the big ones, or this idea of building agents. And nobody knows what an agent is, because nobody has defined it. And every single company defines agents in a different way.

I think I have here, this is a document that I have here in my hands from OpenAI. It's public. I think they published it a couple of weeks ago.

And they define an agent by saying, agents are systems that independently accomplish tasks on your behalf. That's it. So it's a system.

Wow. It's capable of doing something on your behalf. And by the way, this definition is a little bit different from the one from Anthropic. Exactly what I was about to say.

Right. So everyone is defining these systems in a different way, but why is that?

The way I like to think about agents: it's just a piece of software that's capable of doing something on your behalf automatically, right?

Like you give them instructions and this thing is just going to go out there and just do something for you. This is where a lot of people, including me, think the future is going to go. Right now, agents are like robots.

You're starting to see the value, and some of them kind of work, to the point where you can record a great video and you see them jumping and moving. But they're not necessarily trustworthy yet, at least not the complex ones, right?

So for doing little things, sure; for doing complex stuff, you can record the video, but they're not really there yet. But they will become better, because this is an engineering problem right now, right?

So even if the LLMs do not get better — even if we just take the models as they are today — I believe you can build reliable agents to do a lot of useful stuff with the tech that we have today. We have to keep working on all of the tech that's around the LLM, right?

The guardrails, the security — stuff that we've been building for years — we have to bring into this new field to augment that LLM with all of that, so we can actually build reliable agents. But we're going to get there.

And that is pretty exciting, for sure. Yeah. Why do you think we're in that state?

Because the way I even define agents is different. Why is there no centralized definition that we can all agree upon?

It seems like — because you said basically what I was going to say, Anthropic's definition is different. The way that I even define an agent is: it can be given a task and iterate over that task as many times as it deems necessary, until it feels it's satisfied the request or the goal. I feel like that's still accurate.
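Danny's iterate-until-satisfied definition can be written as a loop. In this sketch, `act` and `is_satisfied` are hypothetical stand-ins for an LLM call plus tools and a goal check; the toy example just counts:

```python
def run_agent(goal: str, act, is_satisfied, max_steps: int = 10) -> list:
    """Act, check the goal, and repeat until satisfied (or out of steps)."""
    history = []
    for _ in range(max_steps):
        history.append(act(goal, history))
        if is_satisfied(goal, history):
            break
    return history

# Toy run: the "goal" is satisfied after three actions.
steps = run_agent(
    "count to 3",
    act=lambda goal, history: len(history) + 1,
    is_satisfied=lambda goal, history: len(history) >= 3,
)
print(steps)  # [1, 2, 3]
```

Note the loop itself isn't what makes something an agent — a rule-based workflow can loop too; the argument is over who decides what to do on each pass.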

Why is there no universal definition that everyone agrees upon?

Just reading the OpenAI one — if you read the OpenAI one, I would argue that a regular workflow with four different steps, no AI involved, would satisfy this definition exactly. And that's not an agent; that's just a workflow. It's a rule-based workflow that I create, and I can loop in a workflow: I can ask, is my objective accomplished? No? Just go back. I mean, we invented that a long time ago. So I guess, you know, OpenAI is trying to be as general as possible here. Some other people are saying, no, no, no, you have to narrow that down a little bit.

So it's just AI agents we're talking about. I don't know, it's just definitions. It's the browser wars all over again.

We're all fighting for a little bit of capital. Right. You know, there are people that are going to fight those battles.

I'm very, very pragmatic. I don't care about how you name things. What I care about is: is the thing working?

Can I build it?

That's what I care about. I'm going to let them just define, make the definitions for me, for sure. Yeah.

Let me ask you this. What AI developments do you think — because I have my opinions — are, like, completely overhyped in the community right now? But also, what aspects do you feel are maybe, like, dangerously underrepresented? Like, it's not even on people's radar when it should be. You mean... specifically about what? I mean just AI in general. So, yeah. I think the biggest area that's been hyped right now — and there is a good reason why it's been hyped — is that replacement of coders, right, as a thing.

Like, you get the OpenAI CEO, who's been on record a couple of times saying, oh, yeah, development is just going to die. Or the NVIDIA CEO, who said, yeah, don't learn to code. Or the Anthropic CEO, who said in six months we're going to be replacing all of the developers.

What's funny is that all of them have an incentive to say that. You have to remember that, right?

So incentives are aligned with them saying, yeah, don't learn to code. Then you get other people, like the GitHub CEO — I think he said last week, or maybe earlier this week — you should actually learn to code, and a bunch of other people saying, yes, you should. So you get these two opposite poles. So I think that's overhyped by some of these big personalities who've said, yeah, we're going to be replacing developers, or we're going to be, you know... When you dig a little bit deeper, it turns out there is a lot of color in that commentary, a lot of nuance.

They sort of, like, backtrack and say, no, no, no, what we mean is, like, competitive programming. Well, okay.

So competitive programming has nothing to do with what engineers do at a company, nothing to do. We're not even talking about the same ballpark. Okay.

So that's completely different. So anyway, I think that's very, very overhyped.

An area that I think people are not paying enough attention to. Ah, man, I'm going to pass on that one. I'm not sure.

Nothing comes to mind right now. I think what we have is an overhype of the field. So it's really hard for me to just find an area that is not being hyped to death.

But yeah, I mean, if you ask me, hey, what are the things that you truly, truly believe are going to take off?

Again, it's the tooling portion — how to build tools. Agents, obviously, I think are just a big thing. I'm on the fence on RAG, because even though the concept, the idea, is very useful, I believe the value is going to diminish — not disappear.

We still need good mechanisms to ingest information that's all over the place. But I do believe that as models get better, we're not going to have to be chopping up content into small chunks and doing all of the dance for RAG to work. But yeah, MCP, and A2A, which is agent-to-agent communication — I think those two are going to become bigger.

Building agents in general, that will become really, really big in all regards. We have no idea how to do this well yet. We're just starting by building small things that do things.

These are going to grow. Think of microservices, okay?

How the whole field sort of, like, exploded, and everyone was creating all of these complex microservices architectures just to communicate and build big systems. The same thing is going to happen with agents, where companies are going to be deploying agents that can work with the agents of other companies. You could have Stripe with an agent that's going to process a transaction, and you can connect that agent with Slack just to, I don't know, post on Slack when a transaction comes in.

And you can connect that to another agent that's going to — you see where I'm going with this. You can have this network of agents working together to accomplish a task. That's going to be really cool.

So I think that's an area that will continue to grow. Yeah. I will tell you — so, as of the time of recording this, last week, or actually the week before last, man, time flies.

The week before last, I was in New York for DocuSign's conference, Momentum. Shout out to This Dot Labs, because we built something basically like that, to a level, with contract agreements. And so there are some organizations — this blew me away, I didn't know this — there are some organizations out there that are using millions of different versions of a form. Millions.

Like I talked to one organization, they say we have over 3 million versions of a form. Insurance, financial, et cetera, et cetera, et cetera, right?

And so we built a three-way integration, influenced by AI, but using DocuSign's IAM, right? And the idea was: let's say you send somebody an agreement. You obviously get the email from DocuSign saying, oh, we've got this agreement here, right? On the other side of it, we built the top six apps in their App Center that are using their AI integrations with DocuSign — Iris, et cetera — where, number one, we have an Asana dashboard that showcases the status of that document. Is it pending?

Has it been signed?

Is it completed?

Et cetera, et cetera, et cetera. So once you sign that agreement in real time, Asana will update that dashboard for you. It then ties into Slack and it'll send the message saying, hey, the document's been sent to XYZ user.

They haven't signed it yet. Next step: the document's been signed, and it tags the key account holder or the leader of that team in the Slack message, because sometimes we miss those email notifications. And then from there, the third integration: it now ties into, like, a MailChimp drip feed.

So it's like, hey, welcome to our program — whatever you wanted to say. But through all those steps, previously each step required human intervention. Now none of it requires somebody stepping in, so they don't have to manually trigger each flow for that to happen. That's the kind of stuff that I geek out about.

When it comes to underhyped things, honestly, my mind has been stuck on one area in particular, and I just don't know why it's not getting more attention — or at least, if it is, I definitely haven't seen it, and I would love to, because I'd love to geek out about it: TinyML models and edge intelligence. Bringing machine learning and algorithms to microcontrollers, to embedded devices. I've always liked IoT in general, but with this I feel like there's a massive world of potential use cases. The one that I was really thinking about sparked from a conversation with family friends — you know, we've got a kid, you end up getting friends with kids, and you talk about stuff. One of them just had a baby. And I was like, damn, what if you had AI tooling for — because this saved us when we were parents — those baby swings that rock them to sleep?

But what if you had an AI solution where it monitored the baby's movement, and maybe rapid eye movement or something like that, where it notices the baby's probably coming out of a REM cycle and might wake up — maybe it increases the swing a little bit more. And then as it knows the baby's deep in a REM cycle, maybe it goes ahead and lowers it, based on several determinations. I feel like that is such a next-level aspect to childcare that could affect daily lives. And I feel like that's severely underhyped.
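Purely as a hypothetical sketch of that idea: on a real device this would be a quantized TinyML model running on the microcontroller (e.g. via TensorFlow Lite Micro), but here "the model" is just a moving average over motion readings, with made-up thresholds:

```python
def swing_speed(motion_readings: list[float], current_speed: int) -> int:
    """Nudge the swing speed (1-5) based on recent motion activity."""
    recent = motion_readings[-5:]
    activity = sum(recent) / len(recent)
    if activity > 0.6:   # stirring: likely coming out of deep sleep, rock more
        return min(current_speed + 1, 5)
    if activity < 0.2:   # settled: deep in the cycle, ease the swing down
        return max(current_speed - 1, 1)
    return current_speed  # in between: leave it alone

print(swing_speed([0.8, 0.9, 0.7, 0.8, 0.9], current_speed=3))  # prints 4
```

The point of edge ML is that a loop like this runs on the device itself — no cloud round-trip, which matters when the "user" is a sleeping baby.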

And I don't know why. This is something that, you know, I'm really excited about. It's called LoRA.

And the idea of LoRA is this: imagine you get these AI models. They're huge.

We cannot just install three of these models in our phones, right?

Just to think. But you can have one model. Now, that model is usually a generalist, meaning we trained that model with general information about the world.

It's not a specialized model, which is what you're looking for right now. You're looking for a specialized model that knows how to control that baby swing. But here is the good thing.

Taking that big model — if you wanted to just go down the path of retraining that model for every new use case, that would produce many big models. And again, we go back to: they don't fit on my phone. But using LoRA, you can actually take a small portion of that model — without getting into the technicalities of it.

You can create something that's called an adapter, and just train that adapter. And that adapter plus the big model — if you put them together, if you plug in that adapter — that big model now transforms into what you want it to do. So what this means is that on a phone, you could have one adapter that's been trained on summarization, another adapter that's been trained on grammar.

Another adapter that's been trained on reading and understanding your email. Another adapter that's been trained on photo processing. So you have five, six, seven, ten, a hundred adapters.

And as you're going to use them in your phone, you can just plug them in with the big model. And now the big model is capable of that specific use case. That is huge.
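The adapter idea can be shown in miniature with plain numpy. The big weight matrix `W` stays frozen; the adapter is the product of two small trainable matrices `B` and `A`, and "plugging it in" is just adding that low-rank product. The sizes here are toy, but the parameter arithmetic is the real argument:

```python
import numpy as np

d, r = 1024, 8                      # model width vs. adapter rank
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))         # frozen generalist weights
A = 0.01 * rng.normal(size=(r, d))  # trainable adapter, half 1
B = 0.01 * rng.normal(size=(d, r))  # trainable adapter, half 2

x = rng.normal(size=d)
generalist = W @ x                  # the base model's behavior
specialized = (W + B @ A) @ x       # same model with the adapter plugged in

# The adapter stores 2*d*r numbers instead of retraining all d*d:
print(2 * d * r, "vs", d * d)       # prints: 16384 vs 1048576
```

Swapping specialties means swapping the tiny `(A, B)` pair — summarization, grammar, photo processing — while the big `W` is shared, which is why a phone can hold many adapters but not many full models.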

That is the way Apple, supposedly based on their blog posts, not based on their execution, because their execution has been horrible so far, but based on the blog posts, this is what they did for their Apple intelligence, where they can have this big model trained by them, supposedly, and just plug and play different adapters to have that big model do different things. That is coming. I mean, it's coming for Apple intelligence, and we're going to see more of it going forward.

Most of the industry, however — remember, most of the industry is focused on how we can create a better... not bigger, right now. I think we're giving up on that idea; we've realized that bigger is no longer better. So I think for the size, we're good — but a better generalist model.

And big AI is leaving it to companies to take on the, well, you-specialize-your-model-for-your-use-case part — obviously, they're not going to do that for us. But we have a people problem. Meaning: when you go to companies, how many companies do you know that have people with the skills to actually build good specialized models?

Not many, right?

Like DocuSign, sure, that's a big company, right?

But when you go down there, I mean, I know companies that they don't even have the skills to build websites, let alone machine learning models, right?

So as that knowledge starts trickling down, and as we get better tooling to do this, and it becomes easier, and we get more people trained, you're going to start seeing specialized solutions like that one. And those are going to be game changers, for sure. I think that's a really interesting point there — as it becomes easier and easier to utilize. I've been playing around with Replicate a lot, and so I can do a LoRA with Flux, and it cost me a dollar, and I was able to fine-tune this Flux model to do exactly what I wanted. It just made it very approachable. I'm very curious what your stack looks like these days. What is your AI tool stack?

Like what are you using to do everything from just like your day-to-day coding, but also if you wanted to do any of this, like LoRa stuff or fine tuning stuff, what does that look like for you on a day-to-day basis too?

Yeah, great question. So myself, I'm most of the time using Cursor, but I do use a lot of specialized extensions. What's funny is that I always find these extensions because the companies reach out to me to sponsor them or whatnot.

And I get to try their tools and I'm like, aha, I really like that. So I keep using it. Right.

So things like Tracer, things like CodeRabbit — things like, right now I'm testing one I cannot disclose. I can disclose what it does, but I cannot disclose the name.

It's a tool specifically for data scientists, to manage Jupyter notebooks. And that's something that I do a lot. Again, this is a specialized tool — nobody has one like this one. It's just for Jupyter notebooks.

So data scientists would obviously be the main user base. This goes beyond what Cursor or Windsurf or Copilot do, because those are more general, aimed at developers. Tracer does something that's very, very good, which is an integration with GitHub issues, which I use a lot.

What that means is that you go to GitHub issues — you have your issues, people are reporting problems with your code base — and you immediately have a full integration from the issue in GitHub with your code. So as soon as, let's say, Leon, you go to my source code, you find a bug, and you write down the bug in GitHub issues, immediately after that there is a bot that's automatically going to provide documentation to solve the issue, a resolution plan, the steps, an activity diagram of what's going to happen. So all of that happens automatically, without me intervening at all, right?

So you just wrote the issue, and you're going to get that resolution right there. And then I can click a button and have that entire history — the comments from the issue, the resolution, the plan, all of that — imported into my Visual Studio Code or Cursor or Windsurf. And immediately click a button and let that bot just follow the plan it created and solve it.

So those are very specialized tools that I've used. But again, the foundation of everything is, I would say, 80% Cursor. The other 20% will be either Visual Studio Code or Windsurf — I'm liking Windsurf a lot. Your mileage may vary: some people say, oh, Windsurf is better; some people say Cursor is better. I think staying in one of them is better, and that's what I actually do. I don't like to be switching.

Actually, the reason I have this 20% separate on those tools is because, literally, at any time I have four or five different projects, and having them separated by tool makes it easy for me to find them on my desktop. You might think I'm stupid, but that's the way my brain works. So, let's say my website, right? I have my website open in Windsurf. And whenever I want to make a change to my website, I know that I just need to go on my Mac and find Windsurf, and my website is there. But my class I have in Cursor. So whenever I want to make changes to the code base of my class, I go to Cursor. So that's mainly why I use them differently. Any of them are very good.

The difference is — to be honest with you, now we're splitting hairs here — some of them do something better. Just get to know your tool; get good at it. I promise, most of us are not in a position where we can actually take advantage of that extra difference.

You know what I'm saying?

It's like, eh, it just doesn't matter. So yeah, that's what I use. Oh, and for fine-tuning and whatnot.

So, number one, I would recommend: if you're going to be using an OpenAI-based model, they have a fine-tuning website. Just use that. That's it.

Don't worry about it. Depending on what you want to do, there is another tool, for example, I think it's called... I'm sorry, I'm forgetting the name right now.

But what they do is allow you to fine-tune a model — number one, open-source models, and number two, in a very specific way. Supabase, sub-base... it's not the database; it's a name around those lines.

And I apologize that I'm forgetting right now, but they allow you to train a model using specific algorithms, okay?

So the reason this is important is because, for example, you do not need a lot of data to fine tune your model, okay?

Which is not the case when you just want to do regular fine-tuning. This method of fine-tuning is using reinforcement learning. Oh, by the way, it's the method that was used by DeepSeek.

Remember when DeepSeek came out and everyone was like, holy moly, how good are they?

Well, they were really good. But on top of being really good, their paper, for the first time, listed the method they used to train DeepSeek, which is completely different from what we knew before that. So that was sort of like a breakthrough — a good thing.

They used reinforcement learning to fine-tune their model, so they don't need that much data to fine-tune and get a very good model. And there are tools by now that allow you to use that specific method. So if you need that, you would use one of those tools.

And I apologize, I'm forgetting. Predibase. Predibase.

That's the name. It's predibase.com. They will allow you to do this.

So again, it depends on what I'm trying to do. That's what I would use. Yeah.

Love this episode. To be honest, I think this is an easy one: we need to bring Santiago back and do another one. But let's do our Ask Danny, Leon, and Santiago.

But to be honest, given what we've talked about, I feel it's probably going to be very, very important to do a specialized question for this one. If somebody is, you know, I want to learn AI. Y'all convinced me this is the thing.

I want to start learning. Where should I learn?

What should I build?

And please don't tell me another chatbot. I'll let Santiago take this one first. It really depends on where you are starting.

Okay, so let's say you're starting from: I've never used AI, I know how to code — what should I do? Well, my first recommendation is to start simple. Start with Cursor: download Cursor, start using their free plan, the free models, and start asking questions and asking Cursor to build stuff for you. You can also check out one of the websites that are online — Replit, Bolt, Lovable; there are a bunch of competitors in that area that will let you build applications. Just start. Become comfortable letting AI do stuff for you, and evaluate based on that.

Just become comfortable with it. That would be my first recommendation. Now, let's say you want to start building something that's AI-specific.

Like what should you do?

Well, if you want to learn how to build RAG applications, just build a simple RAG system that answers questions from YouTube, for example, right?

Like, you want to watch this whole episode on YouTube? Build a simple system that grabs this episode and answers questions about it: who said this and that — and the system should say, oh, Danny was the one mentioning that. Or, what did they say about MCP? And your system should summarize it. And when you build that — if you build that — all of a sudden you have something that's useful for answering questions about other YouTube videos you're watching. So that would be one example of an application you could build. If you want to learn MCP, just build a simple MCP server.

For example, for my class, here's what I did. Part of my class is MCP — I'll be teaching MCP in session six — and I wanted to build a simple MCP server to learn how to do this. So what I built was an MCP server that's going to manage my deployments.

So before, whenever I wanted to deploy a model to AWS, I would run a command to do that. And, you know, it was a big command, right?

Now I can just ask Cursor to do that for me. I can tell Cursor, hey, can you deploy this model?

And can you name it this way?

Or are there any deployments that are active?

And Cursor knows immediately. Cursor says, yes, you have a deployment that's active. How does that work?

There is an MCP server that I built, and Cursor communicates with that MCP server and checks my AWS account. So I created a much better interface for people who are going to be using that server, through MCP. You can do that.
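A toy sketch of the shape such a server takes. This is not the real MCP SDK (the actual protocol speaks JSON-RPC) and not Santiago's server — the tool name and the canned AWS answer are made up. The point is the pattern: tools get registered with descriptions, so a client like Cursor can discover them with one call and invoke them with another:

```python
TOOLS = {}

def tool(description: str):
    """Register a function as a callable tool with a human-readable description."""
    def register(fn):
        TOOLS[fn.__name__] = {"description": description, "fn": fn}
        return fn
    return register

@tool("List the model deployments that are currently active")
def active_deployments() -> list[str]:
    # A real server would query the AWS account here; this returns canned data.
    return ["sentiment-model-v2"]

def list_tools() -> dict:
    """What the client calls first, to discover what the server offers."""
    return {name: meta["description"] for name, meta in TOOLS.items()}

def call_tool(name: str, **kwargs):
    """What the client calls when the model decides to use a tool."""
    return TOOLS[name]["fn"](**kwargs)

print(list_tools())
print(call_tool("active_deployments"))
```

When you ask Cursor "are there any active deployments?", it is effectively doing this list-then-call dance against your server.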

So build something like that — by the way, a super simple tool to build. Or build a simple agent, something that checks your email, if you want to learn how to build agents. Can you build a simple agent that checks your email account and tells you whenever you receive a message from your wife or your husband, or a promotional message? Now you have to process the message and determine whether it's promotional or not. So things like this — which seem very small and maybe dumb — have a tendency of giving you tools to improve. Your knowledge in that area will start building up, you'll become better and better, and you'll gain the skills that you need. Leon, any advice?

Yeah. I mean, as someone who went on this journey recently, the things that were the most helpful for me were, one, the fast.ai University of Queensland course. It gave me the background I was missing for a lot of the fundamental stuff. I've done Andrew Ng's stuff in the past, but the fast.ai course really got me up to speed on the core concepts for playing around with things. I really like Replicate: I can get something up and running, I can do very basic fine-tuning. It really helped me go beyond the theory to actually using it.

For MCP, I think it just comes back to reading the docs, right?

modelcontextprotocol.io has everything you need to get up and running. I was able to follow the docs and build an MCP server; I was talking to YouTube videos with it.

So, yeah, I think those are kind of the key things that have helped me as I go down this route. Yeah, I'll say: the resources mentioned — I'm not going to go into them again. If you really want to go another route, talk to ChatGPT and constantly iterate with it if you want to build something.

The one thing that I'll recommend — I've been recommending this to some college students, even some senior devs who are just trying to get their feet wet. You know what's a cool thing to build that you wouldn't really think about, that nobody really talks about? Build an alt-text generator. Submit an application, or submit an image; let it read the image; store it if you want — if you want to start digging down that path, like vector databasing, et cetera — and just translate what it sees into text and add it as alt text to the image. I feel like just understanding how the API works and doing it is a great way to get your feet wet, if that's something you want to do. And then you can run down the gamut after that. But if it's your very first app and you don't want to build a chatbot — because I think everybody has built a chatbot — this is a great way to introduce yourself to the ideas of what AI could potentially do for you. All right, y'all, it's been real, it's been fun. We will see you on the next one. Goodbye, everybody!
