Jan 19, 2024

Haystack. The proverbial needle in a stack of LLM orchestrators.

All you could ever need or want in one LLM orchestrator


The problem

If you are like many of the enterprises, sole proprietors, and individuals looking to harness generative AI for business use cases, you have probably had to decide which framework to use for your LLM application.

Unfortunately, this choice gets harder by the day, given the number of new libraries and frameworks being released for building with LLMs. Some of the standout libraries you have most likely heard of are LangChain, LlamaIndex, and my favorite, Haystack.

What drove me to Haystack was my search for a library that would let me not only ingest, preprocess, and index my data, but also filter the retrieved context using metadata fields I created at index time. In my opinion, nobody does this better than Haystack. Metadata filtering is what initially attracted me, but the mass of additional features I found is what has kept me!
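To make the idea of metadata filtering concrete, here is a minimal, library-free sketch. It does not use Haystack's API: the `Doc` class, the `search` function, and the `department` and `year` fields are all hypothetical names chosen for illustration. It only shows the general pattern of attaching metadata at index time and narrowing the candidate set by that metadata before ranking.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for an indexed document (not a Haystack class)."""
    content: str
    meta: dict = field(default_factory=dict)


def search(docs, query, filters=None):
    """Keep docs whose metadata matches every filter, then rank by word overlap."""
    filters = filters or {}
    query_words = set(query.lower().split())
    # Step 1: metadata filtering narrows the candidate set
    candidates = [
        d for d in docs
        if all(d.meta.get(key) == value for key, value in filters.items())
    ]
    # Step 2: a toy relevance score (number of shared words) ranks what is left
    scored = [
        (len(query_words & set(d.content.lower().split())), d) for d in candidates
    ]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


# Metadata is attached when the documents are created, i.e. at index time
docs = [
    Doc("Refund policy for enterprise customers", {"department": "legal", "year": 2023}),
    Doc("Refund policy for individual customers", {"department": "support", "year": 2023}),
    Doc("Holiday schedule", {"department": "legal", "year": 2024}),
]

results = search(docs, "refund policy", filters={"department": "legal"})
```

Without the filter, both refund documents would match the query. With it, only the one tagged for the legal department is returned.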

What is Haystack?

Haystack is an end-to-end framework for building applications with LLMs. This means that everything from document processing to properly formatting the LLM's answer to your question can be done with Haystack. That may not sound entirely different from the aforementioned libraries, but I will try to set Haystack apart with several important points, and then you can decide for yourself.

Documentation

The first and most obvious difference is the high-quality documentation available for Haystack. The developers made sure to include thorough information, easy-to-find components, and in-depth examples of how to implement each component, even with different parameters. This is the most important point: how can anyone use a framework to its full potential without proper documentation?

Pipelines

As you begin to build your application, you may find that you need to perform certain steps over and over again. This is where Haystack's ready-made pipelines come into play. They include several common pipelines that are easy to deploy for document preprocessing, chunking, embedding, and storing, as well as for simple querying and several other use cases.
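The idea of a reusable indexing pipeline can be sketched in plain Python. This is not Haystack code: `clean`, `chunk`, `embed`, and `index_document` are hypothetical helpers, and the "embedding" is a toy tuple standing in for a real model. The point is only to show repeated steps being chained once and then reused for every document.

```python
def clean(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def chunk(text, words_per_chunk=5):
    """Split text into consecutive chunks of at most `words_per_chunk` words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def embed(chunk_text):
    """Toy stand-in for an embedding model: (word count, character count)."""
    return (len(chunk_text.split()), len(chunk_text))


def index_document(text, store):
    """Run the steps in a fixed order and append each result to `store`."""
    for piece in chunk(clean(text)):
        store.append({"content": piece, "embedding": embed(piece)})
    return store


store = index_document(
    "Haystack   pipelines chain\nreusable steps so you write them once.", []
)
```

A ready-made pipeline plays the role of `index_document` here: the order of the steps is decided once, and each new document simply flows through it.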

Custom Nodes

Pipelines are a very powerful and time-saving feature. Thinking back to my point about this ever-changing field, you might imagine that predefined pipelines could lock you into certain dependencies, but this is not at all the case. That brings me to my next point: custom pipeline nodes. Within the Haystack documentation, you will find a very user-friendly how-to on constructing custom pipeline nodes to be used with the ready-made pipelines already at your disposal. If you have discovered a new and more powerful PDF converter that you would like to use but don't want to wait for it to be added to the repo, simply build a custom node according to your needs and add it to the pipeline. This ensures you are never limited while using Haystack.

                  
    # Custom node template (Haystack 1.x node API)
    from typing import List, Optional

    from haystack.nodes.base import BaseComponent


    class NodeTemplate(BaseComponent):
        # If it's not a decision component, there is only one outgoing edge
        outgoing_edges = 1

        def run(self, query: str, my_arg: Optional[int] = 10):
            # Insert code here to manipulate the input and produce an output dictionary
            ...
            output = {
                "documents": ...,
                "_debug": {"anything": "you want"},
            }
            # Return the output dictionary and the name of the outgoing edge
            return output, "output_1"

        def run_batch(self, queries: List[str], my_arg: Optional[int] = 10):
            # Insert code here to manipulate the input and produce an output dictionary
            ...
            output = {
                "documents": ...,
            }
            # Return the output dictionary and the name of the outgoing edge
            return output, "output_1"
                  
                

Continuity and Control

User control. It is easy to see how this was kept in mind through every step of writing the Haystack library. Other frameworks may force you to use conversation memory through an abstraction you cannot change, process and return documents in an inconsistent fashion (sometimes plain text, sometimes structured, depending on the document loader you use) rather than in a unified Document schema, or lock you into one type of FAISS index when the library is capable of much more. You won't find this with Haystack. Metadata, custom node creation, preprocessor parameters, prompt templating: all of this behavior and fine-grained control is at your fingertips.
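As a rough illustration of why a unified document schema matters, the sketch below normalizes two differently shaped loader outputs into one structure. It is plain Python rather than Haystack's own Document class: `UnifiedDocument`, `from_plaintext`, and `from_structured` are hypothetical names, and the sample inputs are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedDocument:
    """Hypothetical unified schema: every loader's output ends up in this shape."""
    content: str
    meta: dict = field(default_factory=dict)


def from_plaintext(text, source):
    """Wrap a raw string, as a plain-text loader might produce."""
    return UnifiedDocument(content=text.strip(), meta={"source": source})


def from_structured(record, source):
    """Wrap a dict-like record, moving every non-text field into metadata."""
    meta = {key: value for key, value in record.items() if key != "text"}
    meta["source"] = source
    return UnifiedDocument(content=record["text"].strip(), meta=meta)


unified_docs = [
    from_plaintext("  Quarterly report summary.  ", source="report.txt"),
    from_structured({"text": "Invoice total: 42 USD", "page": 3}, source="invoice.pdf"),
]
```

Once everything shares one shape, downstream steps such as filtering, retrieval, and prompt building only need to handle `content` and `meta`, whatever loader the data came from.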

Conclusion

In conclusion, Haystack gives its users a production-ready, easy-to-use framework that covers just about all of their needs and makes it easy to write integrations for the ones it doesn't. Haystack goes to work for you, rather than forcing you to tweak your goals, use cases, or deployment plans to fit whatever other framework you may have chosen.
