Nat’s 2022 Technical Link Pile: GPT-3
December 30, 2022 – 7:16 pm. See the Intro for context. (Also, this was a late addition and my interests here are narrow, so this is thin.)
[20221231] Reverse Engineering Notion’s Prompts — in the Hacker News discussion there’s skepticism that you’re getting the actual prompts, rather than GPT-3 being its usual stylistically faithful yet factually inaccurate creative self. Someone from Notion confirmed the skepticism.
[20221231] Some Short-term Predictions — in the age of AI, if you don’t own the query interface, you’re just assembling training data for those who do. (The behaviour of the social media tech powers means I do not trust successful AI companies to use their powers wisely.)
[20221221] On Meaning, Form, and Understanding in the Age of Data — In this position paper, we argue that a system trained only on form has a priori no way to learn meaning.
[20221221] On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? — In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.
[20221221] ChatGPT Playing a Complex Game — basically a German board game with different resources and rates of conversion.
[20221221] Prompt Engineering Guide — pointers to papers and tutorials on prompt engineering.
[20221221] Prompt Engineering Guide from Microsoft — give it a high-level task description, guide it with examples, guide it with high-level contextual information (eg an API sketched in comments), guide it with conversational history.
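A rough sketch of what that pattern might look like against the 2022-era OpenAI completions API (the model name, the commented-out API, and the function names are all illustrative, not from the Microsoft guide):

```python
import openai  # 2022-era library (0.x); assumes openai.api_key is set

# The guide's pattern, roughly: a high-level task description, contextual
# information (an API sketched in comments), and an example to steer it.
prompt = '''# Task: write a function that looks up a user and prints their name.
# Available API (context):
#   get_user(user_id: int) -> dict with keys "id", "name", "email"
# Example:
#   >>> print_user_name(1)
#   Ada Lovelace
def print_user_name(user_id):'''

response = openai.Completion.create(
    model="code-davinci-002",  # the Codex model of the day
    prompt=prompt,
    max_tokens=80,
    temperature=0,
)
print(prompt + response["choices"][0]["text"])
```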
[20221221] Best Practices for Prompt Engineering with the OpenAI API — Put instructions at the beginning of the prompt and use ### or “”” to separate the instruction and context; Be specific, descriptive, and as detailed as possible about the desired context, outcome, length, format, style, etc; Articulate the desired output format through examples; Reduce “fluffy” and imprecise descriptions; Instead of just saying what not to do, say what to do instead; Code Generation Specific – Use “leading words” to nudge the model toward a particular pattern (eg leave SELECT as the last word of your prompt to build a SQL query).
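The SELECT trick is the most concrete of those, so here’s a minimal sketch (the schema and model name are mine, illustrative only):

```python
import openai  # 2022-era library (0.x); assumes openai.api_key is set

# Instruction first, ### to separate it from context, and "SELECT" left
# dangling as the last word so the model completes a SQL query.
prompt = """Write a SQL query returning the ten highest-spending customers.
###
Tables: customers(id, name), orders(id, customer_id, total)
SELECT"""

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=100,
    temperature=0,
    stop=[";"],  # stop at the end of the statement
)
print("SELECT" + response["choices"][0]["text"])
```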
[20221221] GPT-3 Lacks Systematicity — asked to reformat dates, it sometimes gets days and months the wrong way around.
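That failure is easy to probe yourself: feed it the ambiguous dates (day ≤ 12) and count the swaps. A quick sketch, with illustrative prompt wording and model name:

```python
import datetime
import openai  # 2022-era library (0.x); assumes openai.api_key is set

def reformat(date_str):
    # Ask the model to convert DD/MM/YYYY to ISO format.
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Rewrite the date {date_str} (DD/MM/YYYY) as YYYY-MM-DD:\n",
        max_tokens=12,
        temperature=0,
    )
    return response["choices"][0]["text"].strip()

wrong = 0
for day in range(1, 13):  # days <= 12 are the ambiguous day/month cases
    d = datetime.date(2022, 7, day)
    got = reformat(d.strftime("%d/%m/%Y"))
    if got != d.isoformat():
        wrong += 1
        print(f"{d.strftime('%d/%m/%Y')} -> {got} (expected {d.isoformat()})")
print(f"{wrong}/12 wrong")
```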
[20221221] Tracing Emergent Abilities of Language Models to Their Sources — although the initial GPT-3 might seem superficially weak, it turns out that its abilities serve as very important foundations for all the emergent abilities unlocked later by training on code, instruction tuning, and reinforcement learning with human feedback (RLHF). […] The ability to respond to human instructions is a direct product of instruction tuning. The ability to generalize to unseen instructions is a free lunch given by scaling the types of instructions. The ability to do complex reasoning with chain-of-thought is likely a magical side product of training on code.
[20221221] The GPT-3 Architecture on a Napkin — exactly what I needed.
[20221221] Azure Cognitive Service for Language (Pricing) — ouch.
[20221221] Finetuning with the OpenAI Language API — giving training examples on the command line.
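For the record, the shape of it: training data is a JSONL file of prompt/completion pairs, which you then hand to the openai CLI. The examples below are made up; the file format and command are the documented 2022-era ones:

```python
import json

# Illustrative sentiment-labelling examples in the documented JSONL format.
examples = [
    {"prompt": "Great pub, lousy parking ->", "completion": " mixed"},
    {"prompt": "Best flat white in town ->", "completion": " positive"},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Then, from the command line:
#   openai api fine_tunes.create -t train.jsonl -m davinci
```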