ai
Resources
Large Language Model (LLM)
https://www.youtube.com/watch?v=zjkBMFhNj_g
Two files - llama-2-70b model
parameters
run.c
file
- every parameter is stored as two bytes (float 16 number as the data type)
- the run file can be in any language. in this case, about 500 lines of c code to run the model
- you can run this locally. you compile the c code, and then point the c binary at the parameters and talk to it
Model training is a lot more intensive than model inference
Model training is like the compression of the internet.
Cost about 2 million dollars, and about 12 days to train the 70b model
This is the lossy compression of the internet. We don't have an identical copy of it.
Neural network - Predict the next word in the sequence
next word prediction
the network dreams internet documents
neural network built by multiple steps of optimization
can measure the output in different situations
The assistant model
- train on internet document
- train on dataset that we collect manually
- hire people, give labeling instructions, come up with questions and write answers for them
- finetuning
Fine tuning is a lot cheaper
What accuracy are you going to achieve?
Multimodalities
Thinking fast vs thinking slow: instinctive vs slower
GenAI is a system 1 -- words enter in a sequence
AlphaGo
Custom LLMs
Augmented generation
Thinking for a long time using system 2
Kernel - an operating system
Jailbreak attacks
Base64 jailbreak
The LLM is fluent in Base64
Opensource LLM
https://github.com/ollama/ollama
https://github.com/facebookresearch/llama