Transformers for Natural Language Processing and Computer Vision: Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3
D**T
An absolute must-buy
Having bought the first two editions of Rothman's seminal Transformers for Natural Language Processing book, I was super-excited to get this third edition. Upon receiving it, the first thing that struck me was that it is a *big*, heavy book, with some 20 chapters and 700-odd pages. No bad thing.

As one might expect, some of the book is the same content from the previous editions (e.g., the opening few chapters on what transformers are and their architecture), but the vast majority of the book appears to be new content. Clearly technology has moved rapidly since the last release, hence there are chapters on RAG, Vertex AI, and PaLM.

Also all-new are the chapters on transformers for vision (hence the name change of the book to Transformers for Natural Language Processing and Computer Vision). In this new section, Rothman describes some of the most powerful CV transformers, discusses Stable Diffusion, and explains how to use Hugging Face's AutoTrain. However, this reader was particularly intrigued by the chapter on "generative ideation", in which Rothman postulates the concept of generating ideas and content without human intervention. Fascinating stuff.

In summary, Rothman is clearly an expert in this area. The book is written in an easy-to-read manner, with lots of examples and good explanations of some quite complicated concepts. If you are serious about transformers, then this is simply a must-buy book, written by the foremost authority in this area. Highly recommended.
R**V
Great read - Highly recommended.
A book that is easy to follow along with and understand the concepts of. Denis, you have outdone yourself. It is relevant, the topics covered are exciting, and they have certainly made me enthusiastic and knowledgeable about Foundation Models. Chapeau to you, sir, and I will definitely recommend your book to others.
N**S
Reasonable content but poor phrasing throughout
I'm pleased I bought it, but I found the writing really poorly phrased: it feels like there has been no effort by the editor to make the content flow like natural English, and it's at a level that actually hinders understanding. Particularly in the early chapters, there are justifications that feel like the whole point is lost in translation. No doubt this can be remedied in a later edition.
S**N
Excellent!
Transformers for Natural Language Processing and Computer Vision is a fully packed third edition from the author. He divides the book into three parts, which is well designed from the reader's perspective.

Part I: The Foundations of Transformers:

As the name implies, the author introduces what Transformers and Foundation Models are and explains them at a high level. This is followed by the background of NLP, which helped us understand how RNN, LSTM, and CNN architectures were abandoned and how the transformer architecture opened a new era.

The author's explanation of the transformer's unseen depths bridges the gap between its functional and mathematical architecture by introducing emergence, with a detailed walkthrough, neat diagrams, and code. He also covers the potential of downstream tasks and how to evaluate models with metrics such as accuracy, F1-score, and MCC.

The author covers the advancements in translation with Google Trax, Google Translate, and Gemini. He then goes through the three steps of machine translation and how to implement machine translation, which are significant topics for understanding the concepts.

The ins and outs of BERT and the explanation of the RoBERTa transformer model are impressive. The author explains how to pre-train a Hugging Face RobertaForCausalLM model to be a Generative AI customer support chat agent for X (formerly Twitter). RoBERTa is an encoder-only model.

Part II: The Rise of Suprahuman NLP:

Yes! In "The Rise of Suprahuman NLP", the author describes how the Generative AI revolution with ChatGPT involves tremendous improvements and the diffusion of ChatGPT models into the everyday lives of developers and end-users.

The implementation of automated RAG with GPT-4 is a significant part of this book, followed by the author helping us fine-tune OpenAI GPT models with neat, crystal-clear steps; you must read and try this yourself.

The author's explanation of the tokenizers' role in shaping transformer models is very interesting. His note on tokenizer-agnostic best practices for measuring a tokenizer's quality is a key takeaway, and his guidelines for datasets and tokenizers from a tokenization perspective are notable.

The chapter on leveraging LLM embeddings as an alternative to fine-tuning is remarkable and highly impactful.

How "Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4" goes through the revolutionary concepts of syntax-free, nonrepetitive stochastic models is incredible, as is the author's explanation of how to use ChatGPT Plus with GPT-4 to run Semantic Role Labeling (SRL) samples from easy to complex.

The next major takeaway from this book is the concepts and architecture of T5 and ChatGPT, and how to apply T5 to summarize documents with Hugging Face models.

The author also explores cutting-edge LLMs with Vertex AI and PaLM 2, showing how Google PaLM 2 can perform a chat task, a discriminative task (such as classification), a completion task (also known as a generative task), and more, which is very useful.

Everyone is responsible for mitigating risks in large language models. The author covers how to examine the risks of LLMs, risk management, and risk mitigation tools.

Part III: Generative Computer Vision:

In "The Dawn of Revolutionary AI", the author explores how to see the world beyond text using vision transformers. He explores innovative transformer models such as ViT, CLIP, DALL-E, and GPT-4V and their implementations. He also expands the text-image interactions of DALL-E 3 to divergent semantic association.

The author helps us go through each of the main components of a Stable Diffusion model, peek into the source code provided by Keras, run the model, and then run a text-to-video synthesis model with Hugging Face and a video-to-text task with Meta's TimeSformer.

The author's guide to Hugging Face AutoTrain will help us understand how to train a vision transformer using Hugging Face's AutoTrain. He also discusses the road to functional AGI with HuggingGPT and its peers, showing how we can use cross-platform chained models to solve difficult image classification problems.

Overall … I can give this 4.5/5.0. Indeed, an extraordinary effort from the author that is much appreciated.

- Shanthababu Pandian
AI and Data Architect | Scrum Master | National and International Speaker | Blogger | Author
M**A
Good reference for my research
I am working on research that uses transformers for Natural Language Processing and Computer Vision. I find this book excellent as a reference and a source of information. I like it.
S**S
Detailed and approachable book on AI
I really enjoyed the book; I thought it was very thorough and well explained. It covers everything you would want to see in a book on NLP and CV.