OpenAI Releases Smartest AI Ever & How To Use It

Okay, so OpenAI made a big move: they released a brand-new model called o1. This is the first model in a brand-new series alongside their GPTs, and what it specializes in is reasoning. What does reasoning mean exactly? I would define it as thinking about something for more than a few seconds. That definition works really well in ChatGPT, as you’ll see in a second, because this new model takes a bit of a different approach. We’ll talk about that shortly, but first things first: how can you access this, and what are the limitations?

Let’s move over to the screen recording. This is available to all ChatGPT Plus and Team users. As I’m on a Team account, I already have access to o1-preview and o1-mini right here. There are big limitations, though. In ChatGPT, o1-preview gets 30 messages per week. That’s per week, so be careful with it. o1-mini gets 50 messages per week.

API access is unlimited, but it has only been rolled out to people who have spent $1,000 or more; you need to be in the tier 5 category with OpenAI, so not everybody has API access. Those are the base facts. Now let’s talk about this reasoning capability. Up until now we were familiar with models like GPT-4o and a lot of competitors that do much the same thing: if you tell one to write you an essay about penguins, it will just do that without really thinking about what makes a good essay about penguins.

Introduction to OpenAI’s Latest Breakthrough

With a prompt like this, I’m not saying the new model works better. I’m just saying it’s going to spend some time thinking about the answer before it gives it to you. And this is nothing new. Anybody who has been following the channel, or the prompt engineering, large language model, and gen AI space closely over the past year or two will have run into a technique called chain of thought. Chain of thought is easily described as a different way of prompting that includes a little more reasoning, a little more thinking. You would just add “think step by step” and then you would get improved results on reasoning-related tasks.
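
As a minimal sketch of the chain-of-thought trick described above, the Python helper below simply appends the phrase to a prompt before it would be sent to a model. The function name `with_chain_of_thought` and the exact wording of the suffix are illustrative assumptions, not part of any library.

```python
def with_chain_of_thought(prompt: str, suffix: str = "Think step by step.") -> str:
    """Return the prompt with a chain-of-thought instruction appended.

    Hypothetical helper for illustration: classic chain-of-thought prompting
    just adds an instruction like this to the end of the user's prompt.
    """
    base = prompt.strip()
    # Avoid appending the instruction twice if it is already there.
    if base.lower().endswith(suffix.lower()):
        return base
    return f"{base}\n\n{suffix}"


if __name__ == "__main__":
    print(with_chain_of_thought("How many weeks are there in 3.5 years?"))
```

The point of the sketch is how little the technique involves: the task stays the same and only one extra instruction is added.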

What are reasoning-related tasks? Let’s go a little deeper. They’re tasks in the domains of science, math, and coding. So something like writing an essay is not going to be improved. This is not some magic bullet; you don’t just switch over to the new model and everything is amazing. But there are a lot of things it is amazing at, and my hope is that by the end of the video you will understand them a little more closely. As it is improved in the domains of science, coding, and mathematics, what if you don’t work in science, coding, or mathematics? Does this even matter to you? Well, I would put it this way: if you could master, or at least get to a decent level of understanding and application of, math, coding, and science without any effort, would that be useful to you? I think that’s the real question to answer here, because initially people will say, “Oh, I don’t need math in my everyday life, I don’t need a math PhD in my everyday life.” Well, if you had one with no effort and no cost, would it be useful to you? That’s the question to ask here.

These are the tasks that you can now perform. They claim it does PhD-level mathematics, and they have various benchmarks to back that up, the most interesting one being this comparison graph. As you can see here, on competition math GPT-4o solved 13.3% of the problems. o1-preview, the model that we have right now in the web interface, this one right here, scored 56%, and the full o1 model, which is not released yet and is coming in the future, scored 83.3%. I’ll directly quote something from their blog article: in a qualifying exam for the International Mathematics Olympiad (IMO), GPT-4o correctly solved only 13.3% of the problems, that’s this right here, while the reasoning model scored 83%. So this is not a little change, not a small improvement in reasoning and thinking; this is a massive change. There are also massive changes in how it processes your request.
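
To put the benchmark figures quoted above in perspective, here is a small back-of-the-envelope calculation in Python. The three scores are the ones stated in the video; the dictionary name and the `gain_over_baseline` helper are assumptions made for this sketch.

```python
# Competition-math scores as quoted in the video (percent of problems solved).
SCORES = {"GPT-4o": 13.3, "o1-preview": 56.0, "o1": 83.3}


def gain_over_baseline(model: str, baseline: str = "GPT-4o") -> tuple[float, float]:
    """Return (percentage-point gain, ratio) of `model` relative to `baseline`."""
    points = SCORES[model] - SCORES[baseline]
    ratio = SCORES[model] / SCORES[baseline]
    return round(points, 1), round(ratio, 1)


if __name__ == "__main__":
    for name in ("o1-preview", "o1"):
        points, ratio = gain_over_baseline(name)
        print(f"{name}: +{points} points, about {ratio}x GPT-4o")
```

Showing both the point gain and the ratio makes it clearer why this reads as a jump rather than an incremental improvement.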

If you ask for an essay like that, the result might be the same, but if you run it inside the new model it will take a whole lot longer. I prepared a few examples in advance. As you can see, for this simple business plan with a $2,000 budget, it took 9 seconds to answer. If you run that through GPT-4o, it just starts generating right away. This one was thinking for 9 seconds and then started generating, because it actually goes ahead and creates a bit of a plan: multi-step reasoning. This is really a big step toward the agentic future that we’ve been talking about on this channel quite a bit, because before you execute something, you think about it, right? Think about it just from a human perspective.

I think this is really the important point here. Say you are told, “Hey, could you please help me prepare a business plan for a new t-shirt brand with the name and slogan ‘Where is the voice assistant?’” (which, by the way, I still wonder). Say somebody tasked you with creating a business plan for a new t-shirt brand with a $2,000 budget. This is not the type of question that you would just come up with an answer to on the spot. This is the type of question where you would say, “Sure, I can do that. Let me get back to you. By when do you need it?”, implying that there are multiple steps in thinking this through. You have to think about the marketing plan, you have to think about the finances, you have to break it down: what do we need? Do some research, think about that research, let it sink in, then come up with a business plan. But from GPT as our assistant, so far we always expected it to just give us the answer right away, and then we were disappointed when the answer was not so good. Fair enough, we’re early on in the development of this technology. But this changes now. I also want to make the point that this is not just chain of thought built in. This is not just them adding “think step by step” at the end of every prompt, because if that were the case, we could have solved some of these harder problems, the ones that require a little more reasoning, early on, and that just wasn’t the case.

Key Features of the Smartest AI Yet

I can give you a good example. I played around with it for a bit, and this palindrome example is a really good one, because GPT-4 was just terrible at this. A palindrome, if you’re not familiar, is a phrase that reads the same from front to back and from back to front. So here I just said, “Create a palindrome related to cats with hats,” and it thought for 18 seconds, with all of these different steps in between, and then it gave me “I saw a cat a hat a attch a was I”. Now, that doesn’t really make sense, so I said, “Okay, but make it like a common sentence for common people.” Then it thought for 31 seconds and said, “Okay, here’s a palindrome that reads like a common sentence: ‘Was it a cat I saw?’” Front to back, back to front: was it a cat I saw. This is awesome, but it excludes the hat. It then goes ahead and admits that, okay, it couldn’t really fit the hat in meaningfully, and offers another one that incorporates both. So it’s at least self-aware. If you throw this into GPT-4o, you will be very disappointed, because it gives you something like “Eva, can I see bees in a cave?” There you go. That has nothing to do with the request.
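
The palindrome definition above translates directly into a few lines of Python, which is handy for checking a model’s answer yourself. This is a minimal sketch; the function name `is_palindrome` and the choice to ignore spaces, punctuation, and case are assumptions, though that is the usual convention for sentence palindromes.

```python
def is_palindrome(phrase: str) -> bool:
    """Check whether a phrase reads the same forwards and backwards.

    Assumption for this sketch: spaces, punctuation, and letter case are
    ignored, which is the usual convention for sentence palindromes.
    """
    letters = [ch.lower() for ch in phrase if ch.isalnum()]
    return letters == letters[::-1]


if __name__ == "__main__":
    for candidate in ("Was it a cat I saw?", "Cats with hats"):
        print(f"{candidate!r}: {is_palindrome(candidate)}")
```

A check like this only verifies the palindrome property; whether the phrase is actually about cats with hats still needs a human eye.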

So that was initially very interesting. And look, this is not really coding, science, or mathematics, right? On the other hand, it’s also not something that you would use in your everyday life, I would argue. We’ll talk in a second about how you can actually use this model in your everyday life, but before that I really do want to show you some of these changed capabilities, because another interesting one is translation. Let me tell you, as somebody who speaks three languages at the level of my English: it’s just something that has been granted to me. My life evolved in a way where I didn’t really have a choice about learning all three of them. I speak English, Slovak, and German, in no particular order, all at this level, at least for reading; I suppose my Slovak writing is way worse than my English. But essentially I’m trilingual, and when you translate things back and forth there is a lot of complexity and a lot of context involved. Certain things only make sense in a certain language and cannot really be translated word by word.

So there’s this one phrase from German that I just threw in here right away, and I was kind of blown away by the result. It’s a ridiculous phrase, the kind people often make fun of German for having, but German is a bit crazy like that. It has a lot of very peculiar phrases that are commonly used, and German speakers don’t even realize it as long as they don’t context-switch to other languages. What this one literally translates to is “that’s when the dog goes crazy in the frying pan.” So it’s a completely nonsensical and weird phrase, but it is a thing, it is used in German, and it’s hard to translate. But o1-preview actually went in and started looking at idioms in Hindi, because it was curious to see whether mixing literal and idiomatic translations works best, and then it came up with “Well, I’ll be darned.” And let me tell you, this is perfect. This is how the phrase is actually used in German. Now let me see what we would get from GPT-4 without these extra reasoning steps, these extra thinking steps. My guess would be that this is not going to go as well. “That’s unbelievable,” “That’s crazy.” Almost, but not quite. “Well, I’ll be darned” just feels more accurate to me as a German speaker. And also, you don’t get this long answer; you get a short, concise answer. So look, these are two simple little use cases that are completely unrelated to science, coding, and math.

Practical Applications: How to Utilize the AI

Science, coding, and math are what they advertise it for: they say this model is for advanced reasoning. But I would challenge that a little bit. These are just my first impressions, okay? But I also did a thing in the background where I ran the same prompt twice, once in each model, and this would be my argument for why I think we’re going to find over the next days and weeks that this model is way more useful than it might seem initially.

I don’t think we’ve seen any game-changing use cases as of yet, but I think we will with prompts in this direction. I carefully compared the results; it’s not night and day, but there is a difference. The prompt: create a business plan for a new t-shirt brand with the name and slogan “Where is the voice assistant?” with a $2,000 budget. GPT-4o is on the left, o1-preview on the right. This is the part I was curious about, because if this is so good at math, it’s good at money, and being good with money is a useful thing to people outside of the sciences, coding, or math. So I gave it this prompt, and let me tell you, I was actually extremely surprised by 4o. They must have changed something under the hood, because look: it includes a table and it does the calculations right. This is 4o, not the new preview model; this is the original model we had. But something must have changed under the hood, because a few months ago this result would not have been accurate. It does the math correctly here: this adds up to 2K. Then it gives you a budget and cost estimates for everything, and in the end, if you scroll all the way down (I’m sorry if I’m scrolling a bit too fast here), it gives you a conclusion, and it’s decent. It’s good.

Now, when I ran this through the new model, it’s slightly better, I would say. It is slightly better at the estimations and the calculations, but 4o was precise, and that was the surprising part to me. If I just look at these conclusions, I think this one is more accurate, and I would consider it higher quality myself. The overview and all the processing are also structured better. I really dislike the fact that here it gives you too much detail in the various steps, whereas here it breaks down what you need. Then I followed up with this prompt over here, check this out: “What is the optimal level of spend to launch this brand?” It’s kind of a tricky follow-up prompt, because what do I mean by optimal, and what does it take to launch this brand? It has to figure all that out. Well, this model thought for 10 seconds and then gave me an answer, and 4o just gave me an answer right away. It landed on 5 to 10K as the sweet spot, but that is sort of an estimate. I like this conclusion over here more: it gives me, hey, these are your three options, and here’s a summary of why each would be good. I just personally think this is a higher-quality answer. So my first intuition, and I will test this more extensively and report back, is that whenever money is involved, this is going to be more useful. And I don’t think that’s a trivial conclusion or a little prompting trick that you might or might not use. Heck, whenever you need to do financial calculations, just use this new model and let it think about it for a second.
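
The “this adds up to 2K” check above is easy to automate when you want to verify a generated budget yourself. The sketch below is in Python; the line items and amounts are hypothetical placeholders, not the figures from either model’s answer, and the helper name `budget_gap` is likewise an assumption.

```python
BUDGET = 2000  # total budget from the prompt, in dollars

# Hypothetical line items for illustration only; replace them with the table
# the model actually produced.
line_items = {
    "Initial inventory": 900,
    "Design and branding": 300,
    "Online store setup": 200,
    "Marketing": 450,
    "Contingency": 150,
}


def budget_gap(items: dict[str, int], budget: int) -> int:
    """Return budget minus total spend (0 means the plan adds up exactly)."""
    return budget - sum(items.values())


if __name__ == "__main__":
    gap = budget_gap(line_items, BUDGET)
    print("Adds up" if gap == 0 else f"Off by ${abs(gap)}")
```

A positive gap means money is left unallocated, and a negative gap means the plan overspends the budget.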

Getting Started: A Step-by-Step Guide

Whenever you do something more complex, like a marketing plan or a business plan, something that takes a little more thinking, think about it the way I presented at the beginning: if a human would need to think about it for more than 3 seconds, or 10 seconds, then your AI assistant most likely will too. I have one more thing. I could go into many interesting facts and points from their presentation, but I think this one video over here was particularly interesting.

I would recommend you check it out. It’s from OpenAI, and they basically talk about how they built it. There was this one interesting fact: they tried to map human thinking and use the ways humans think inside the model, and then they tried letting the AI do the same thing on its own, and that worked way better. I thought that was just interesting. But beyond that, what’s the point and what’s the conclusion here? For everyday users, is this going to be useful? We shall see. I’m not exactly sure. I think I see a slight improvement in these money-related prompts, but in my everyday tasks, even with ChatGPT, I don’t use many science- or math-related things. I do use a bit of coding. If you don’t have coding skills, there is the improved coding ability of this model, which is now supposedly better than Sonnet 3.5, which was state of the art for code generation until now. So for code generation, apparently this is best in class; I’ll have to do some further testing to confirm that. But my point is, if you don’t work in science, math, or code generation, then it’s questionable whether this will be useful, beyond thinking about it this way: would a human need more than a few seconds to answer this? If so, I need the model.

I would also bring up a second point, and I would like to round it out with that and a few prompting tips that I found. Up until now, with ChatGPT you really had one AI assistant. You threw it a task and the AI assistant did what it could. It presented you with the findings, and if it didn’t have some context, because you didn’t provide it or because it wasn’t available in the training data, it just hallucinated it and did everything it could to give you an answer. Well, with o1-preview it’s more like you have multiple AI assistants. You spawn three of them and they talk to each other before they talk back to you. So originally you were talking to one assistant doing what he or she could and replying. Now you tell it what you want, and multiple assistants show up in the background, talking to each other and figuring it out. That’s what this thinking would be; that’s what these multiple steps would be. And then they reply to you. Look at that: 31 seconds of thinking about how to rearrange this palindrome to make sense. I think that’s a really good way of thinking about it, and I’ll leave you with some final prompting tips that were discovered by Cognition Labs. Cognition is the builder of Devin, which you might be familiar with. It was the first AI agent preview of its kind. We still don’t have access to it, but they’re saying that with this model it works way better than anything before, so that’s going to be interesting.

They have some prompting tips here. First of all, never again tell this model to think step by step or think out loud; it works better if you don’t. The model is trained on this, it’s set up to behave that way; your AI assistants are set up to talk to each other. Secondly, if you have more clutter in your prompt, it will perform worse. Traditional prompting, as they say here, often fleshes things out. Even Sam the Prompt Creator, the free tool that I made accessible and built with the AI Advantage community, really fleshes out the prompts, refreshes certain instructions, gives them multiple times, and reinforces them, because that’s how traditional prompting works.

Ethical Considerations and Future Implications

This model prompts a little differently: less is more. In the past I referred to this as goal-based prompting, as opposed to fleshing out all of the context. Goal-based prompting is better for agents, and this is the agentic direction, so it’s better for this model. What’s the difference between normal prompting and goal-based prompting? Normal prompting would be something like: “You are an expert in children’s literature. Write me a children’s book about topic XYZ, with this turn and this story arc, and make it 10 pages.” I just invented all that, but that would be normal prompting. The second one would be: “Write the best children’s book possible for this specific target audience.” And then the multiple agents, the multiple assistants, figure it out amongst themselves.
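
To make the contrast concrete, here is a small Python sketch that holds the two example prompts side by side and strips chain-of-thought phrases before a prompt would go to a reasoning model. The prompt texts paraphrase the examples above, and the helper name `strip_step_by_step` and its phrase list are assumptions for illustration, not an official recommendation.

```python
import re

# The two styles from the example above (wording is illustrative).
DETAILED_PROMPT = (
    "You are an expert in children's literature. Write me a children's book "
    "about topic XYZ, with this turn and this story arc, and make it 10 pages."
)
GOAL_PROMPT = "Write the best children's book possible for this target audience."

# Phrases the tips say to leave out for this model (assumed list, not exhaustive).
_COT_PATTERN = re.compile(
    r"\s*\b(think step by step|think out loud)\b[.!]?",
    re.IGNORECASE,
)


def strip_step_by_step(prompt: str) -> str:
    """Remove chain-of-thought instructions and tidy leftover whitespace."""
    cleaned = _COT_PATTERN.sub("", prompt)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(len(DETAILED_PROMPT), "characters vs", len(GOAL_PROMPT))
    print(strip_step_by_step(GOAL_PROMPT + " Think step by step."))
```

The cleanup helper is only useful for older prompts being reused; a prompt written goal-first from the start has nothing to strip.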
It’s a different approach to prompting, and as you can see, the second prompt is way shorter. This has been the case with all agentic experiments so far, and it seems to prevail. So keep your prompts shorter and simpler, and I would say prompt based on the goal rather than on every single little detail. You can always add the details in follow-up prompts. Too bad that we only get 30 messages per week, but that’s why we have channels like this. We’ll be experimenting with the API and reporting back.

One more thing: it doesn’t have tools. That’s the last thing I wanted to point out in this video. It’s on the roadmap, but right now it doesn’t have code interpreter, it doesn’t have web browsing, it doesn’t have image generation, it doesn’t have image upload, none of these. They did mention that it’s coming, along with ChatGPT eventually just auto-selecting the models and the tools and everything. Right now you have to watch videos like this to know: should I be using 4o, should I be using 4o mini, should I be using o1, should I be using a competitor’s model? In the future it’s going to figure that out on its own. You’re going to give it a goal, it’s going to look at all the tools, it’s going to have full understanding, and it’s just going to make the decisions: use this over here, use that, here’s your result, user. That’s the future of this technology, and this is a major step toward it. Check out the different videos on OpenAI’s channel, for example, and follow this channel, hit subscribe, to hear more about this and how to use it in the real world, because I think there must be some hidden gems in here and I’ll do my best to discover them. All right, I’ll see you soon.

