Highlights

Quotes, notes, and insights extracted from posts.

The core of RAG is a general-purpose fine-tuning approach where both the retriever and the generator are trained jointly and end-to-end on downstream NLP tasks. This means that the parameters of the retriever (specifically the query encoder) and the generator are adjusted based on the task-specific data.
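The "jointly and end-to-end" part can be sketched with RAG's marginalization over retrieved documents, p(y|x) = Σ_z p(z|x) · p(y|x, z). The numbers and names below are toy illustrations, not the actual model: retrieval scores stand in for query-encoder/document-encoder dot products, and the generator likelihoods are made up. The point is that the final likelihood depends on both components, so one task loss trains both.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy retrieval scores: dot products between a query embedding and each
# document embedding (both would come from trainable encoders in RAG).
query = [1.0, 0.5]
doc_embeddings = [
    [0.9, 0.1],  # "doc_a"
    [0.2, 0.8],  # "doc_b"
]
retrieval_scores = [sum(q * d for q, d in zip(query, emb))
                    for emb in doc_embeddings]
p_doc = softmax(retrieval_scores)  # p(z | x), from the retriever

# Toy generator likelihoods p(y | x, z): probability the generator
# assigns to the target answer given each retrieved document.
p_answer_given_doc = [0.7, 0.2]

# RAG marginalizes over documents: p(y | x) = sum_z p(z|x) * p(y|x,z).
p_answer = sum(pz * py for pz, py in zip(p_doc, p_answer_given_doc))

# Because p_answer depends on both the retriever scores and the generator
# likelihoods, a gradient on -log p_answer flows into both sets of
# parameters -- which is what training "jointly and end-to-end" means.
```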
In intelligence and artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner. It perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance with learning or acquiring knowledge. An intelligent agent may be simple or complex: A thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.[[1]](https://en.wikipedia.org/wiki/Intelligent_agent#cite_note-FOOTNOTERussellNorvig2003chpt.2-1)
While compound AI systems can offer clear benefits, the art of designing, optimizing, and operating them is still emerging. On the surface, an AI system is a combination of traditional software and AI models, but there are many interesting design questions. For example, should the overall “control logic” be written in traditional code (e.g., Python code that calls an LLM), or should it be driven by an AI model (e.g. LLM agents that call external tools)? Likewise, in a compound system, where should a developer invest resources—for example, in a RAG pipeline, is it better to spend more FLOPS on the retriever or the LLM, or even to call an LLM multiple times?
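The first design question above, control logic in traditional code versus in the model, can be made concrete with a minimal sketch. Everything here is hypothetical: `retrieve`, `call_llm`, and the stubs are stand-ins, not a real API. In this style, ordinary Python owns the control flow (what to retrieve, whether to spend a second LLM call), and the model is just one step inside it; the agent-driven alternative would invert that, letting the LLM decide which tools to call.

```python
from typing import Callable

def answer_with_rag(question: str,
                    retrieve: Callable[[str], list],
                    call_llm: Callable[[str], str]) -> str:
    """Control logic lives in ordinary Python; the LLM is one step."""
    passages = retrieve(question)  # traditional code decides what to fetch
    prompt = "Context:\n" + "\n".join(passages) + f"\nQuestion: {question}"
    draft = call_llm(prompt)       # first (and usually only) LLM call
    # The code, not the model, decides whether to spend a second call --
    # one concrete answer to "should we call an LLM multiple times?".
    if len(draft.split()) < 3:
        draft = call_llm(prompt + "\nAnswer in a full sentence.")
    return draft

# Stub components so the sketch runs without any real model or index.
fake_retrieve = lambda q: ["Paris is the capital of France."]
fake_llm = lambda p: "Paris is the capital of France."

print(answer_with_rag("What is the capital of France?",
                      fake_retrieve, fake_llm))
```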
With the benefit of hindsight, the field of artificial intelligence stems from the research originally done on Hopfield nets, Boltzmann machines, the backprop algorithm, and reinforcement learning. However, the evolution of backprop networks into deep learning networks had to wait for three related developments: 1) much faster computers, 2) massively bigger training data sets, and 3) incremental improvements in learning algorithms ~ from A Very Short History of Artificial Neural Networks | by James V Stone
“It may be speculated that a large part of human thought consists of manipulating words according to rules of reasoning and rules of conjecture.” “How can a set of (hypothetical) neurons be arranged so as to form concepts? Considerable theoretical and experimental work has been done on this problem by Uttley, Rashevsky and his group, Farley and Clark, Pitts and McCulloch, Minsky, Rochester and Holland, and others. Partial results have been obtained, but the problem needs more theoretical work.”
Note: As I was writing this, I wrote way more sections than I should have, so I decided to break it down into two parts (it would not likely fit in an e-mail anyway). This is part I; in part II, we will compare and contrast history and present, look at AI and innovation in general from a business and economic perspective, and brainstorm on how to build better products catering to a larger group of consumers from the lessons learned in part I.
We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it ~ from A proposal for the Dartmouth Summer Research Project on Artificial Intelligence
A comparison of training throughput (tokens per second) for the 7B model with a context length of 512 on a p4de.24xlarge node. The lower memory footprint of LoRA allows for substantially larger batch sizes, resulting in an approximate 30% boost in throughput. ~Fine-Tuning LLMs: LoRA or Full-Parameter? An in-depth Analysis with Llama 2
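The "lower memory footprint" behind that throughput gain comes down to parameter-count arithmetic: LoRA replaces the full weight update with a low-rank one, W + BA. The shapes below are illustrative (a single 4096 x 4096 projection, rank 8), not figures from the cited benchmark.

```python
# Rough parameter-count arithmetic behind LoRA's memory savings.
d = 4096  # hidden size in the ballpark of a 7B model
r = 8     # LoRA rank; a commonly used small value

full_params = d * d          # fine-tuning W directly: every entry trainable
lora_params = d * r + r * d  # low-rank update B @ A, with B: d x r, A: r x d

print(full_params)  # 16777216 trainable values for this one matrix
print(lora_params)  # 65536 -- about 0.4% of the full count

# Fewer trainable parameters means far smaller optimizer state (Adam keeps
# two extra tensors per trainable weight) and smaller gradients, which
# frees GPU memory for larger batches -- the ~30% throughput gain above.
```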
Pre-training refers to the process of initializing a model with pre-existing knowledge before fine-tuning it on specific tasks or datasets. In the context of AI, pre-training involves leveraging large-scale datasets to train a model on general tasks, enabling it to capture essential features and patterns across various domains. ~Lark
Google+ is a prime example of our complete failure to understand platforms from the very highest levels of executive leadership (hi Larry, Sergey, Eric, Vic, howdy howdy) down to the very lowest leaf workers (hey yo). We all don't get it. The Golden Rule of platforms is that you Eat Your Own Dogfood. The Google+ platform is a pathetic afterthought. We had no API at all at launch, and last I checked, we had one measly API call. One of the team members marched in and told me about it when they launched, and I asked: "So is it the Stalker API?" She got all glum and said "Yeah." I mean, I was joking, but no... the only API call we offer is to get someone's stream. So I guess the joke was on me.
To reduce the cognitive load of developers, the Platform team should cover the entire tech stack: Infrastructure/DevOps/SRE, but also frontend, backend, and security topics. ~ from Platform Engineering, Part 2: WHAT Are The Goals of a Platform Engineering Team? | by Benoit Hediard | Stories by Agorapulse | Medium