I don't really get Perplexity: it is amazingly slick, but I get almost line-by-line identical output from Bing Chat, so I have to wonder how much differentiation it really affords (I haven't set up an account, just comparing free access). This one, though, has mostly gotten what I asked it right (including some arcane C++ stuff), so I will be giving it a try at home.
It's not very good at giving proper weight to version numbers.
Granted I started with a hard one, but I asked it how to create a GTK3 interface with PHP, and it gave me instructions to download and use an abandoned project for GTK2, but described it as GTK3 in the steps.
I tried asking it some other questions about languages and applications where the version number matters. It either gives incredibly ambiguous, version-agnostic responses, or tells me essentially "you may or may not be able to do this, and you should check if you can" when the answer is clearly that it is not possible. Or it just ignores the version entirely and provides instructions that don't match up, hallucinating UI elements or commands that don't (or didn't yet) exist.
For something targeted at developers, this is a gaping hole and what I would consider a major oversight: the responses I'm getting are very similar in content to what I get from GPT and from Ollama's generic models.
That's kind of an interesting issue, I wonder if different tokenization would help. Like maybe putting a space between GTK and the number would put them in separate tokens and give better output.
More generally, do text AIs not support weighting terms like the image AIs do? Over in Stable Diffusion that sounds like something where I'd add a weight, like "How do I create a <GTK3:1.2> interface in <PHP:1.1>?"
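For what it's worth, some text-generation APIs do expose a crude version of this: OpenAI's chat completion endpoint accepts a `logit_bias` map from token ids to bias values in [-100, 100]. It isn't per-phrase weighting like Stable Diffusion's, but it can nudge the model toward or away from specific tokens. A sketch of what the request body might look like — no request is sent here, and the token id is a placeholder (real ids come from the model's tokenizer, e.g. via tiktoken):

```python
# Sketch of an OpenAI-style request body using logit_bias to upweight
# a token. 1234 is a placeholder, NOT a real token id; look up real ids
# with the model's tokenizer (e.g. tiktoken) before using this.
GTK3_TOKEN_ID = 1234  # placeholder

payload = {
    "model": "gpt-4",
    "messages": [{"role": "user",
                  "content": "How do I create a GTK3 interface in PHP?"}],
    # Bias values range from -100 (effectively ban) to 100 (strongly favor).
    "logit_bias": {str(GTK3_TOKEN_ID): 5},
}
```

Whether that actually steers the model toward version-correct answers is another question; biasing a surface token doesn't give it knowledge it lacks.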
It is quite possible that the lack of actual intelligence in the LLM is the obstacle in this context.
In fact, I also just got "perplexing" results from a query, though I tried the "generic knowledge" side rather than the coding-specific one: the engine's reply included good pointers, but clearly without knowing why they were especially relevant; that relevance instead appeared in the linked references.
It is an LLM+RAG based search engine: the value is only partly in the summary, which could even be misleading (as expected, given the lack of actual intelligence); the value is in the linked resources.
In other words, it "understands" your query better than a search engine of the past, and that is valuable. But for the actual solution you are querying for, the "summary" part could be good or could be defective: it is probably best to consult the linked material. Material that you might not have found as quickly otherwise; with past technology it could have been tricky to express your need in a way that yields good search results.
Interesting take! At face value, I would say that if this is the intended usage proposition, the summary actually adds negative value and should not exist.
Or perhaps a more brief summary for each result explaining the relation?
Have you tried Agent Mode? It offers greater intelligence and accuracy compared to Fast Mode.
P.S. Agent Mode is a superior option to Fast Mode. It meticulously examines your questions and assigns an appropriate agent to provide answers, leveraging GPT-4 technology in its operations.
Have you thought about accepting the query as a GET request?
FYI, I can't view your terms because it claims my browser is incompatible. The website itself (devv), HN, OpenGL applications, YouTube (JS-heavy): everything works fine, but the plain-text pages where your ToS and privacy policy should be give that error message instead, with no further information I could pass along for debugging.
In case anyone knows, I'd be curious: does that mean no terms apply to my usage if I can't view them by reasonable means? Just whatever local law defaults apply? Earlier today I noticed the terms of the local zoo 404'd (while buying tickets online) and I wondered the same
Users will likely find the positioning or, let's say, mental mapping effective. Perplexity serves as an anchor for general searches, triggering thoughts like, "Hey, I want to search for something." However, if you're a coder, it might specifically prompt, "Hey, I want to search for something related to code."
Take, for example, several GPT-wrapped products like Monica.im. While Monica offers more convenience, I still find myself sticking with ChatGPT to get my tasks done. There’s something to be said for the power of habit!
Ultimately, what matters is whether your service can deliver superior search results.
Consider Devv, which has crafted a specialized search mode for GitHub repositories. It's uncertain whether Perplexity will follow this path. Devv aims to cater to all code-related searches, continually refining its outputs and taking extra care to prevent bad cases.
Vertical and general are two sides of the same coin.
Your implementation strategy sounds interesting! I'll give it a try. Reading about your design made me wonder whether I, as a user, could request new indexes for libraries I use.
I.e., if the quality of the RAG index is your primary offering, then as a user I imagine my experience will depend on how well you have indexed the things I care about. Maybe my language of choice (Rust) has decent indexes, but some random crate I try to use might not.
I'd love to be able to queue up index ingests of standard API sources like docs.rs/crates.io and be notified when that ingest completes.
Thank you for your valuable feedback; it's an excellent suggestion! In fact, we've already begun implementing this feature with our initial step being the introduction of GitHub Mode. This new functionality will enable seamless integration with your personal GitHub repositories. We've developed a bespoke indexer tailored to various programming languages to enhance this experience.
Furthermore, we can expand this capability to include documentation and other resources as well. The architecture is designed to be extensible, so all that's needed is the creation of additional indexers to support these materials.
This is great. I'd love to see a higher level architectural writeup/talk (but not stack specific) about how to build a live search RAG system like this, perplexity, etc.
Is there something like this (maybe this?) that provides an API so I can integrate it like any other model into my own website (in this case, https://cocalc.com)? I tried asking the Phind.com devs, but got ignored.
I would also love an API like this for integration with Plandex[1] (a terminal-based coding agent for complex tasks). Perplexity has an API but it only exposes various open source LLMs, not the search-enriched results from their main product.
It would be really cool if, when starting a coding task with Plandex, relevant docs/context from a web search could be automatically included via this kind of API. Currently, URLs can be loaded into context with `plandex load [url]`, but you have to figure out yourself which URLs would be helpful to load.
Great, we're on it. We'll be looking into this project and keeping you updated at our changelog hub: https://hub.devv.ai/changelog. As soon as our API goes live, we'll post the announcement there.
That sounds interesting. Could you provide further details? By the way, integrating an API is part of our future plans. We plan to enable Devv integration with Slack, Linear, and websites in the future.
Also, if you want to discuss more, feel free to email me at jiayuan@devv.ai
Yes, this is on our roadmap. We will launch "Devv for Teams" in the upcoming quarter. This new feature will enable seamless integration of internal team knowledge, including codebases, wikis, issue trackers, and logs.
If self hosted Devv for Teams supports BitBucket, Confluence, JIRA and Azure DevOps, the company I work for (v large enterprise) would be incredibly interested.
It won't be worth looking into, because there's nothing they (devv.ai) can do, short of trying some automated self-improvement loop à la Devin, where the AI writes code, evaluates it, and fixes issues as they arise... Still not worth it; that's not their core business.
You're just hitting a limit of the LLMs: they won't give you bug-free code, especially not on the first try, and especially not for something complex like a galloping timsort.
"For complex queries where Devv Agent infers your question before selecting appropriate solutions."
Could you expand on this a bit? What does "infers your question" mean?
It's not all that clear to me from the site or your post when Fast Mode vs. Agent Mode should be used. Is Fast Mode for answering conversational questions and Agent Mode for answers that involve writing code?
Oh wow. This is quite decent. I asked it two questions that have historically tripped up either Google Gemini (what does an asterisk in the middle of a parameter list mean in Python) or ChatGPT (how to extend Fernet to use AES-256), and it got both of them right.
I am also pleasantly surprised it is not suffering a "hug of death" following the presentation here. I am curious about your engine's resource requirements: what kind of hardware is it running on?
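For anyone curious, the asterisk question has a tidy answer: a bare `*` in the middle of a Python parameter list makes every parameter after it keyword-only. A quick sketch:

```python
# A bare * in the parameter list makes everything after it keyword-only:
# callers must pass width and height by name.
def resize(image, *, width, height):
    return (image, width, height)

# resize("img.png", 640, 480) would raise TypeError (positional not allowed)
result = resize("img.png", width=640, height=480)
```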
Just curious: could any decent programmer build at least an extremely simple version of this? I'm considering whether it would make a cool summer project.
Creating a simple generative search engine is straightforward and can be accomplished over a weekend.
Essential components include:
- A search engine API (such as Bing or Google's)
- Integration of search engine results with a Large Language Model (LLM)
This framework, known as Retrieval-Augmented Generation (RAG), was the foundation for the initial version of Perplexity.
The challenging aspect lies in refining the generation outcomes, which involves more proprietary techniques.
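The pieces above can be wired together in surprisingly few lines. A rough sketch of the RAG loop — both external calls are stand-ins here; a real version would hit a search API (Bing/Google) and an LLM API:

```python
# Minimal RAG sketch: fetch search results, stuff them into a prompt,
# hand the prompt to an LLM. The search function is a stand-in.

def web_search(query):
    # Stand-in for a real search API; returns (title, snippet) pairs.
    return [("Example result", "An example snippet relevant to: " + query)]

def build_prompt(query, results):
    # Concatenate retrieved snippets as numbered context ahead of the question.
    context = "\n".join(f"[{i + 1}] {title}: {snippet}"
                        for i, (title, snippet) in enumerate(results))
    return (f"Answer using the sources below, citing them as [n].\n\n"
            f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:")

def answer(query, llm):
    # llm is any callable mapping a prompt string to a completion string.
    return llm(build_prompt(query, web_search(query)))

prompt = build_prompt("how do I parse JSON in Go?",
                      web_search("how do I parse JSON in Go?"))
```

As the parent says, getting *good* answers out of this skeleton — ranking, chunking, citation grounding — is where the real work is.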
https://devv.ai/search?threadId=dl3rtxmcsruo
EDIT: The syntax came from a language proposal in a github issue from 8 years ago, so I guess it's not fully hallucinated. But still not the best choice of what source to use.
I utilized Agent Mode to rephrase the query, and here are the results: https://devv.ai/search?threadId=dl3rtxmcsruo
---
> The syntax came from a language proposal in a github issue from 8 years ago, so I guess it's not fully hallucinated. But still not the best choice of what source to use.
Yes, if the source contains errors, the response may contain inaccurate information. We are continually refining the re-ranking algorithm within our Retrieval-Augmented Generation (RAG) system to select the most reliable sources.
Outside of popular languages it seems like they always hallucinate.
I had an issue with Shopify and was able to work through the fix using Perplexity, which I wasn't able to do with ChatGPT on its own.
I love that you can change the models; I mostly use Claude Opus, though.
I do wish the image generator were better, but they frame Perplexity as a search engine rather than a chat tool, so I have Firefly if I really need an image.
Try searching for "Weather in [your city]" and compare it to Google or any weather app. It's consistently wrong.
We're planning to add some free models with search in the future.
I just tested it by typing "llama cpp gpu support" that's it.
Flawless instructions for Python, but when I followed up with
"in node"
It didn't know about node-llama-cpp. Is there a general knowledge cutoff, and/or is loading developer-specific stuff a manual process?
The results: https://devv.ai/search?threadId=dl3vwbdu52ww
So agent mode is better for more recent stuff that you might find in a search engine?
The 3 engines you mention (Perplexity, You.com and Phind) all do that. So do Google, Bing and DuckDuckGo. It makes it easier to link to results and build custom links.
Also, I could add you to Gnod Search then:
https://www.gnod.com/search/ai
https://www.gnod.com/search/ai?q=Python%3A%20How%20do%20I%20...
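Since these engines take the query as a GET parameter, building such links is just URL encoding. A small sketch with Python's standard library (the `q` parameter name matches the Gnod link above; other sites may name it differently):

```python
from urllib.parse import urlencode

def search_link(base, query):
    # Encode the query as a GET parameter so the result page is linkable.
    return base + "?" + urlencode({"q": query})

link = search_link("https://www.gnod.com/search/ai",
                   "Python: How do I sort a dict by value?")
```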
You can view the terms here: https://indexlabs.notion.site/Term-of-Service-6ca77cbc49504c...
Will give it a try today, congrats on the launch!
[1] https://github.com/plandex-ai/plandex
I don't see a link to it on the GitHub page or the website. Do I have to self-host right now?
Good luck with this
Does this mean you intend to let people self-host?
I really distrust putting my API keys into brand-new, unknown websites; it just seems like credential harvesting to me.
You might want to check out this project.
I'm running the code it gave me to try it out on a small list, it's been 10 minutes and it's still running. Might be something worth looking into.
Granted, the way I asked for this function was not the most natural.
[1] https://devv.ai/search?threadId=dl4c8if11c00
"For complex queries where Devv Agent infers your question before selecting appropriate solutions."
Could you expand on this a bit? What does "infers your question" mean?
It's not all that clear to me from the site or your post when Fast Mode vs. Agent Mode should be used. Is Fast Mode for answering conversational questions and Agent Mode for answers that involve writing code?
Feedback: I tried to click one of the links under "source" but it kept jumping down as the LLM-generated content was added.
backend: go/rust/python + gin + mysql + pinecone + es + redis + aws
llm: openai/azure + aws gpu + aws bedrock
Looks like there's an opportunity to improve Fast Mode by caching the results for simple searches.
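A first cut at that could be as simple as memoizing on a normalized query string. A sketch with the expensive path (search API + LLM call) stubbed out:

```python
from functools import lru_cache

@lru_cache(maxsize=4096)
def cached_search(normalized_query):
    # Expensive path (search API + LLM call) stubbed out for illustration.
    return f"results for: {normalized_query}"

def search(query):
    # Normalize casing/whitespace so trivially different queries
    # share a cache entry.
    return cached_search(" ".join(query.lower().split()))

a = search("Llama CPP   gpu support")
b = search("llama cpp gpu support")
```

A real deployment would want an expiring cache (results go stale) rather than `lru_cache`, but the normalization idea carries over.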
I've outlined some initial ideas in this post and may develop a more detailed article later on. Stay tuned!