The (ab)use of Open Source Code to Train Large Language Models: In recent years, Large Language Models (LLMs) have gained significant popularity due to their ability to generate human-like text and their potential applications in various fields, such as…
Posted 1 year ago (02 March 2023)
srush_nlp Microsoft semantic parsing people have developed some nice tools for this: Their code lets you intersect beam search with a CFG to only generate valid programs. Works on (albeit slowly/expensively unless you are OpenAI), works very well for HF models
Posted 1 year ago (01 March 2023)
With Multimodal-CoT, the Amazon 1B model outperforms the previous state-of-the-art #LLM (GPT-3.5) (75.17%→91.68% accuracy) and surpasses human performance on the ScienceQA benchmark. Code is publicly available. #AI #HPC via harmlessai adriano_galano
Posted 1 year ago (23 February 2023)
LLMs & Code Edit Tuning
- Performance-Improving Edits (PIE): curated from users starting w/ slow code & speeding it up
- Use to: fine-tune open-source CodeGen; few-shot prompt Codex
- Speeds up code 2.5x
- A 10x smaller CodeGen matches the perf of Codex
Posted 1 year ago (19 February 2023)
ilyasergey Babble stuck out to me. They use e-graphs for code size compression with fantastic results. Their approach is language-independent and I think it could be applicable to a lot more (I’m thinking performance and compiler optimizations…)
Posted 1 year ago (09 February 2023)
The BLEU score is fundamentally flawed yet is still used in a majority of machine translation papers! My fave example is from the eval of Codex translations: solutions did not overlap much with the "ground truth" but still passed unit tests!
Posted 1 year ago (18 December 2022)
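The point about BLEU above can be illustrated with a toy sketch. It is not the Codex evaluation itself: the two snippets, the `total` function, and the test cases are hypothetical, and `ngram_precision` is only the clipped n-gram precision at the core of BLEU, not the full metric. The candidate shares no bigrams with the reference, yet passes the same unit tests.

```python
from collections import Counter


def ngram_precision(candidate: str, reference: str, n: int = 2) -> float:
    """Clipped n-gram precision over whitespace tokens (a BLEU ingredient, not full BLEU)."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return overlap / total


# Hypothetical "ground truth" solution.
REFERENCE_SRC = """
def total(xs):
    acc = 0
    for x in xs:
        acc += x
    return acc
"""

# Hypothetical generated solution: different surface form, same behaviour.
CANDIDATE_SRC = """
def total(values):
    return sum(values)
"""


def passes_tests(src: str) -> bool:
    """Run the snippet and check it against a few hypothetical unit tests."""
    namespace = {}
    exec(src, namespace)
    fn = namespace["total"]
    return fn([]) == 0 and fn([1, 2, 3]) == 6 and fn([-1, 1]) == 0


precision = ngram_precision(CANDIDATE_SRC, REFERENCE_SRC)
print(f"bigram precision vs. reference: {precision:.2f}")
print("candidate passes tests:", passes_tests(CANDIDATE_SRC))
```

A surface-overlap score would rank the candidate as a poor match, while an execution-based check treats it as correct.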
arXiv: Investigates scaling laws for contrastive language-image pre-training (CLIP) with the public LAION dataset and the open-source OpenCLIP repository. Training distribution plays a key role in scaling laws. Source code and instructions are available.
Posted 1 year ago (15 December 2022)
MichaelKGoff 1. Continuing "what does it mean to understand language" where Winograd left off, but with the tools from 2022 2. Generating code (vs. parroting code) 3. fchollet's ARC challenge, where deep learning does not work. Taking language somewhat seriously, ..
Posted 1 year ago (18 November 2022)
Can we use tools from quantum error correction to add some (partially fault-tolerant) protection to quantum circuits on near-term hardware? We demonstrate progress using the [[k+2, k, 2]] error detection code on Quantinuum’s trapped-ion QC (1/8)
Posted 1 year ago (17 November 2022)
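As a rough illustration of what a distance-2 detection code like the [[k+2, k, 2]] code above provides, the sketch below assumes the common construction with two stabilizers, X on every qubit and Z on every qubit, for even k. The post does not spell out the stabilizers, so treat this as an assumption; the function names are hypothetical. It checks that every single-qubit Pauli error anticommutes with at least one stabilizer, and so is detected.

```python
from itertools import product


def commutes(p: str, q: str) -> bool:
    """Pauli strings commute iff they clash (both non-identity and different) on an even number of qubits."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def stabilizers(k: int) -> list[str]:
    """Assumed stabilizers of the [[k+2, k, 2]] code: all-X and all-Z on n = k + 2 qubits."""
    if k % 2 != 0:
        raise ValueError("this construction needs even k so that the two stabilizers commute")
    n = k + 2
    return ["X" * n, "Z" * n]


def is_detected(error: str, stabs: list[str]) -> bool:
    """An error is detected if it anticommutes with at least one stabilizer."""
    return any(not commutes(error, s) for s in stabs)


def detects_all_single_qubit_errors(k: int) -> bool:
    stabs = stabilizers(k)
    n = k + 2
    for pos, pauli in product(range(n), "XYZ"):
        error = "I" * pos + pauli + "I" * (n - pos - 1)
        if not is_detected(error, stabs):
            return False
    return True


print(detects_all_single_qubit_errors(2))
# Some weight-2 errors commute with both stabilizers and go unnoticed (distance 2):
print(is_detected("XXII", stabilizers(2)))
```

Because the code only detects errors and cannot correct them, a flagged run would be discarded rather than repaired.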