Which AI models to use in business?
If you categorize the models by a single criterion, the question is fairly simple. On one side are proprietary models such as GPT, Claude, and Gemini. On the other are open-source (more precisely, open-weight) models such as Mistral, Llama, Gemma, and others.
I think it is worth briefly reviewing the pros and cons of both approaches to better understand which is appropriate in a given situation.
Proprietary - Pros:
- Generally more powerful than open-source models across a wide range of capabilities, including multimodality and a much larger context window (the number of input and output tokens).
- You get a ready-to-use solution, with no need to deploy and maintain your own infrastructure.
- Regular updates to model features. The provider handles a large layer of technical issues that you would otherwise have to solve on your side.
Proprietary - Cons:
- You give your data to a third party. Depending on the provider's terms, you may be helping to train a model that is not your own and making it smarter. If you are developing a unique solution, this may not be the best way to go, especially when it involves fine-tuning and large amounts of data.
- Providers charge for API usage, and the cost can sometimes be substantial, so you have to build it into your economics as a separate cost category.
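To make the cost point concrete, a rough monthly estimate can be sketched as follows. The function name, traffic figures, and per-token prices are placeholder assumptions for illustration, not any provider's actual rates.

```python
def monthly_api_cost(requests_per_day, input_tokens, output_tokens,
                     price_in_per_million, price_out_per_million, days=30):
    """Estimate monthly API spend, in the same currency as the prices.

    Token counts are averages per request; prices are per one million tokens.
    """
    cost_per_request = (input_tokens * price_in_per_million
                        + output_tokens * price_out_per_million) / 1_000_000
    return requests_per_day * days * cost_per_request


# Hypothetical workload: 10,000 requests a day, 1,500 input and
# 500 output tokens each, at assumed prices of 3.00 and 15.00 per million.
cost = monthly_api_cost(10_000, 1_500, 500, 3.00, 15.00)
print(f"{cost:.2f}")  # 3600.00
```

Even with modest per-request prices, volume turns this into a line item worth tracking separately.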
Open source - Pros:
- You control your entire pipeline.
- You don't send your data to a third party; it stays with you.
- More flexible options for additional training, fine-tuning and other ways to improve your model.
- You don't pay for the model itself; it is free and open source.
- There are large communities that experiment a lot and often produce interesting solutions.
Open source - Cons:
- Open-source models are still generally less powerful and less capable than proprietary ones.
- They sometimes require serious customization for your tasks, which takes research and development time.
- Multimodality is currently underdeveloped, and context windows are much more limited in the number of tokens.
- You need to deploy and maintain your own infrastructure.
This gives a general picture of the situation. Of course, it is far from complete; it is just the tip of the iceberg.
For me, there is still no clear answer to the question of which is better. Personally, I support combining both approaches to get the best of both worlds for different business tasks :)
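As a minimal sketch of what such a combination could look like, the snippet below routes each task to one of two backends based on the trade-offs listed above. The `Task` fields, the routing rules, and the backend labels are all hypothetical; a real setup would use its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    contains_sensitive_data: bool
    needs_long_context: bool = False
    needs_multimodal: bool = False


def choose_backend(task: Task) -> str:
    """Pick a backend label for a task (illustrative rules only)."""
    # Sensitive data stays in-house, whatever else the task needs.
    if task.contains_sensitive_data:
        return "self-hosted-open-model"
    # Long context or multimodal input goes to the hosted proprietary API.
    if task.needs_long_context or task.needs_multimodal:
        return "proprietary-api"
    # Everything else defaults to the self-hosted model to avoid API fees.
    return "self-hosted-open-model"


tasks = [
    Task("summarize customer contracts", contains_sensitive_data=True),
    Task("describe product photos", contains_sensitive_data=False,
         needs_multimodal=True),
    Task("classify support tickets", contains_sensitive_data=False),
]
for task in tasks:
    print(task.name, "->", choose_backend(task))
```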