148 Commits

Author SHA1 Message Date
amitsangani
689c7f261b Update README.md
Modified from Llama Recipes to Llama Cookbook.
2025-01-26 13:42:26 -08:00
Joseph Spisak
8fac8befd7 Update README.md 2024-07-23 07:50:27 -07:00
Samuel Selvan
227d378a77 Merge pull request #1125 from hyungupark/patch-1
Update download.sh
2024-07-22 20:59:46 -07:00
Samuel Selvan
66bc7307da Update download.sh 2024-07-22 18:20:54 -07:00
Samuel Selvan
12b676b909 Update download.sh 2024-07-22 18:15:37 -07:00
hyungupark
c0098be87a Update download.sh
Handle the case where CPU_ARCH is not found.
2024-05-15 12:49:24 +09:00
Joseph Spisak
be327c427c Merge pull request #1124 from dandv/patch-1
README: Llama 2 is no longer the latest version
2024-05-14 15:31:19 -07:00
Dan Dascalescu
893ff972e1 README: Llama 2 is no longer the latest version 2024-05-15 00:53:25 +03:00
Samuel Selvan
b8348da38f Merge pull request #1079 from MattGurney/fix-model-card
Update MODEL_CARD.md
2024-04-09 09:17:49 -07:00
Samuel Selvan
04b200c5dc Merge pull request #1091 from osanseviero/patch-1
Update Hugging Face Hub instructions
2024-04-09 09:16:48 -07:00
Omar Sanseviero
fd7308965b Update README.md 2024-04-08 16:12:21 +02:00
MattGurney
1f9a8d774a Update MODEL_CARD.md
Move word "Uses" into markdown header.
2024-03-23 19:03:21 +11:00
Suraj Subramanian
54c22c0d63 Merge pull request #1077 from mst272/main
update the code to use the module's __call__ (Issue #1055)
2024-03-21 11:50:25 -04:00
wangzhihong
1e8375848d update the code to use the module's __call__ 2024-03-21 10:09:34 +08:00
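For context on the change referenced above, the following is a minimal PyTorch sketch (illustrative only, not the repository's actual code) of why invoking a module through its __call__ (i.e., model(x)) is preferred over calling model.forward(x) directly: __call__ also dispatches any registered hooks.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Toy module standing in for the real model (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = TinyModel()
x = torch.randn(1, 4)

# Preferred: calling the module goes through nn.Module.__call__, which runs
# any registered forward hooks before and after dispatching to forward().
y = model(x)

# Calling forward() directly gives the same result here, but silently skips
# hook dispatch, which is why using the module's __call__ is preferred.
y_direct = model.forward(x)

assert torch.allclose(y, y_direct)
```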
Joseph Spisak
52afd48b06 Merge pull request #1076 from meta-llama/jspisak-patch-7
Update README.md
2024-03-20 10:55:51 -07:00
Joseph Spisak
826ad1198c Update README.md 2024-03-20 10:50:59 -07:00
Joseph Spisak
2f58b8d7b6 Merge pull request #1063 from jeffxtang/LLaMA_lowercase
change LLaMA to Llama in README
2024-03-13 10:23:21 -07:00
Jeff Tang
0b466166ee change LLaMA to Llama in README 2024-03-13 10:18:18 -07:00
Joseph Spisak
9a001c7a09 Merge pull request #1058 from shorthills-ai/main
Update README.md (Undo install command changes)
2024-03-05 19:28:07 -08:00
Shorthills AI
11ebe80305 Update README.md
Undo the "pip install -e ." changes
2024-03-06 08:06:47 +05:30
Joseph Spisak
a0a4da8b49 Merge pull request #1053 from shorthills-ai/main
Update README.md - Fixed some minor grammatical issues.
2024-03-01 06:22:31 -08:00
Shorthills AI
acdb925413 Update README.md 2024-03-01 12:53:48 +05:30
Joseph Spisak
6796a91789 Merge pull request #1046 from facebookresearch/update-contributing_guide
Updating contributor guide
2024-02-28 11:05:52 -08:00
Navyata Bawa
c28bdb58c4 Updating contributor guide 2024-02-28 10:55:21 -08:00
Joseph Spisak
3f61918123 Merge pull request #1033 from ryanhankins/patch-1
Update README.md
2024-02-22 20:17:53 -08:00
ryanhankins
53b227b532 Update README.md
Repair the URL linking to the Llama examples safety checker; the existing URL was out of date.
2024-02-21 16:39:41 -06:00
ruanslv
ef351e9cd9 Merge pull request #900 from flu0r1ne/main
Fix key-value caching for seqlen != 1 (Issue #899)
2023-11-13 21:22:55 -05:00
flu0r1ne
cd0719ddb4 Correct KV comment seqlen -> seqlen + cache_len
Update and add comments about the shape of the key and value
matrices in the attention component. E.g., the second dimension
has length seqlen + cache_len, not seqlen as previously stated.
2023-11-13 14:05:24 -06:00
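To illustrate the shape this commit documents, here is a minimal PyTorch sketch (hypothetical sizes and tensor names, not the model's real configuration) showing why the keys read from a pre-allocated KV cache have a sequence dimension of cache_len + seqlen rather than seqlen.

```python
import torch

# Illustrative dimensions (hypothetical, not the model's real config)
bsz, n_heads, head_dim = 1, 2, 8
cache_len, seqlen = 5, 3          # tokens already cached vs. tokens in this step

# Rolling KV cache, pre-allocated up to the maximum sequence length
max_seq_len = 16
cache_k = torch.zeros(bsz, max_seq_len, n_heads, head_dim)

# New keys for the current step are written after the cached prefix...
xk = torch.randn(bsz, seqlen, n_heads, head_dim)
cache_k[:, cache_len : cache_len + seqlen] = xk

# ...and attention then reads the whole prefix plus the new tokens, so the
# sequence dimension of the keys is cache_len + seqlen, not just seqlen.
keys = cache_k[:, : cache_len + seqlen]
print(keys.shape)  # torch.Size([1, 8, 2, 8]) -> dim 1 == cache_len + seqlen
```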
Alex
6b3154bfbb Update transformer mask comment
Update names for consistency with code

Co-authored-by: ruanslv <ruanslv@gmail.com>
2023-11-13 13:41:06 -06:00
Joseph Spisak
4835a30a1c Merge pull request #916 from facebookresearch/jspisak-patch-6
Update README.md
2023-11-10 07:39:29 -08:00
Joseph Spisak
94b055f4ae Update README.md 2023-11-10 07:38:39 -08:00
Suraj Subramanian
dccf644213 fix faq link 2023-11-08 08:47:11 -05:00
Suraj Subramanian
9cd8d505ca Update issue templates 2023-11-08 08:13:08 -05:00
flu0r1ne
e9077bd241 Fix key-value caching for seqlen != 1
This commit fixes a bug in the key-value caching. Currently,
a square attention mask is misapplied to the scores matrix
despite not matching the shape of the scores matrix. This
results in a runtime error. In a correct implementation, the
decoder mask needs to describe how the new seq_len tokens
interact with all the cached tokens. That is, the attention
mask needs to be of shape (seq_len, total_len), indicating how
the token at row i (representing token i + cached_len in the
transformer model) attends to token j. Accordingly, the matrix
needs to mask entries where j > cached_len + i. This patch
horizontally appends (seq_len, cached_len) zeros to an
upper-triangular mask of size (seq_len, seq_len) to form the
(seq_len, total_len) mask.
2023-11-02 19:33:26 -05:00
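A minimal sketch of the mask construction this commit message describes (illustrative sizes and variable names, not the repository's actual code): an upper-triangular mask over the new tokens is horizontally prepended with (seq_len, cached_len) zeros to form the (seq_len, total_len) decoder mask.

```python
import torch

# Illustrative sizes (hypothetical): 5 tokens already cached, 3 new tokens.
cached_len, seq_len = 5, 3
total_len = cached_len + seq_len

# Upper-triangular mask over the new tokens: within the current chunk,
# token i may not attend to a newer token j.
causal = torch.full((seq_len, seq_len), float("-inf"))
causal = torch.triu(causal, diagonal=1)

# Prepend (seq_len, cached_len) zeros: every new token may attend to all
# cached tokens, yielding the (seq_len, total_len) mask described above.
mask = torch.hstack([torch.zeros(seq_len, cached_len), causal])

assert mask.shape == (seq_len, total_len)
# Entry (i, j) is -inf exactly where j > cached_len + i, i.e. a future position.
print(mask)
```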
Joseph Spisak
54d4463105 Merge pull request #897 from JacobHelwig/main
Correct "bug," typo to "bug", in README.md
2023-11-02 10:53:24 -07:00
JacobHelwig
7909dee4a8 Correct "bug," typo to "bug", in README.md 2023-11-02 12:47:01 -05:00
Joseph Spisak
b5cd38ad2d Merge pull request #891 from facebookresearch/jspisak-patch-5
Delete FAQ.md
2023-11-01 20:48:24 -07:00
Joseph Spisak
664ddc8c8f Delete FAQ.md 2023-11-01 20:47:17 -07:00
Joseph Spisak
3f750f4c4d Merge pull request #890 from facebookresearch/jspisak-patch-4
Update README.md
2023-11-01 20:45:20 -07:00
Joseph Spisak
786af96785 Update README.md 2023-11-01 20:44:47 -07:00
Suraj Subramanian
06faf3aab2 Add FAQs 2023-10-18 13:38:04 -04:00
Joseph Spisak
1c95a19e8c Merge pull request #860 from facebookresearch/add-issue-template
Update issue templates
2023-10-16 11:54:53 -07:00
Suraj Subramanian
0cc2987b61 Update issue templates 2023-10-16 14:51:11 -04:00
Joseph Spisak
6b8cff0e0c Merge pull request #859 from yonashub/patch-1
[closes #858] change "Content Length" to "Context Length" in MODEL_CARD.md
2023-10-14 20:03:02 -07:00
yonashub
f9ddb1d0f3 change "Content Length" to "Context Length" in MODEL_CARD.md 2023-10-14 21:30:47 -05:00
Joseph Spisak
556949fdfb Merge pull request #851 from sekyondaMeta/FAQ-updates
Faq updates
2023-10-11 12:18:13 -07:00
Joseph Spisak
0da077cff6 Update FAQ.md
Made some small fixes and added some context.
2023-10-11 12:17:35 -07:00
sekyondaMeta
5d9bb58a65 Update FAQ.md 2023-10-11 15:08:34 -04:00
sekyondaMeta
98851c3009 Update FAQ.md 2023-10-11 15:05:15 -04:00
samuelselvan
7e1b864d57 Merge pull request #822 from kierenAW/main
Add "--continue" flag to wget for model binary in order to resume dl
2023-09-29 09:32:40 -07:00