amitsangani
689c7f261b
Update README.md
...
Modified from Llama Recipes to Llama Cookbook.
2025-01-26 13:42:26 -08:00
Joseph Spisak
8fac8befd7
Update README.md
2024-07-23 07:50:27 -07:00
Samuel Selvan
227d378a77
Merge pull request #1125 from hyungupark/patch-1
...
Update download.sh
2024-07-22 20:59:46 -07:00
Samuel Selvan
66bc7307da
Update download.sh
2024-07-22 18:20:54 -07:00
Samuel Selvan
12b676b909
Update download.sh
2024-07-22 18:15:37 -07:00
hyungupark
c0098be87a
Update download.sh
...
modify for CPU_ARCH not found
2024-05-15 12:49:24 +09:00
Joseph Spisak
be327c427c
Merge pull request #1124 from dandv/patch-1
...
README: LLama 2 is no longer the latest version
2024-05-14 15:31:19 -07:00
Dan Dascalescu
893ff972e1
README: LLama 2 is no longer the latest version
2024-05-15 00:53:25 +03:00
Samuel Selvan
b8348da38f
Merge pull request #1079 from MattGurney/fix-model-card
...
Update MODEL_CARD.md
2024-04-09 09:17:49 -07:00
Samuel Selvan
04b200c5dc
Merge pull request #1091 from osanseviero/patch-1
...
Update Hugging Face Hub instructions
2024-04-09 09:16:48 -07:00
Omar Sanseviero
fd7308965b
Update README.md
2024-04-08 16:12:21 +02:00
MattGurney
1f9a8d774a
Update MODEL_CARD.md
...
Move word "Uses" into markdown header.
2024-03-23 19:03:21 +11:00
Suraj Subramanian
54c22c0d63
Merge pull request #1077 from mst272/main
...
update the code to use the module's __call__ (Issue #1055)
2024-03-21 11:50:25 -04:00
wangzhihong
1e8375848d
update the code to use the module's __call__
2024-03-21 10:09:34 +08:00
Joseph Spisak
52afd48b06
Merge pull request #1076 from meta-llama/jspisak-patch-7
...
Update README.md
2024-03-20 10:55:51 -07:00
Joseph Spisak
826ad1198c
Update README.md
2024-03-20 10:50:59 -07:00
Joseph Spisak
2f58b8d7b6
Merge pull request #1063 from jeffxtang/LLaMA_lowercase
...
change LLaMA to Llama in README
2024-03-13 10:23:21 -07:00
Jeff Tang
0b466166ee
change LLaMA to Llama in README
2024-03-13 10:18:18 -07:00
Joseph Spisak
9a001c7a09
Merge pull request #1058 from shorthills-ai/main
...
Update README.md (Undo install command changes)
2024-03-05 19:28:07 -08:00
Shorthills AI
11ebe80305
Update README.md
...
Undo the pip install e. changes
2024-03-06 08:06:47 +05:30
Joseph Spisak
a0a4da8b49
Merge pull request #1053 from shorthills-ai/main
...
Update README.md - Fixed some minor grammatical issues.
2024-03-01 06:22:31 -08:00
Shorthills AI
acdb925413
Update README.md
2024-03-01 12:53:48 +05:30
Joseph Spisak
6796a91789
Merge pull request #1046 from facebookresearch/update-contributing_guide
...
Updating contributor guide
2024-02-28 11:05:52 -08:00
Navyata Bawa
c28bdb58c4
Updating contributor guide
2024-02-28 10:55:21 -08:00
Joseph Spisak
3f61918123
Merge pull request #1033 from ryanhankins/patch-1
...
Update README.md
2024-02-22 20:17:53 -08:00
ryanhankins
53b227b532
Update README.md
...
Repair URL to link to Llama examples safety checker. The existing URL was out of date.
2024-02-21 16:39:41 -06:00
ruanslv
ef351e9cd9
Merge pull request #900 from flu0r1ne/main
...
Fix key-value caching for seqlen != 1 (Issue #899)
2023-11-13 21:22:55 -05:00
flu0r1ne
cd0719ddb4
Correct KV comment seqlen -> seqlen + cache_len
...
Update and add comments about the shape of the key and value
matrices in the attention component. E.g., the second dimension is
of length seqlen + cache_len not seqlen as previously stated.
2023-11-13 14:05:24 -06:00
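The shape note in the commit above can be illustrated with a toy, framework-free sketch. It assumes a cache preallocated to a fixed length and filled chunk by chunk; `append_to_cache`, `max_seq_len`, and the string placeholders are hypothetical names for illustration, and the real code operates on tensors that also carry batch and head dimensions.

```python
def append_to_cache(cache, new_entries, start_pos):
    """Write new_entries into cache at start_pos and return the visible prefix.

    The returned prefix has length start_pos + len(new_entries), i.e.
    cache_len + seqlen, not just seqlen.
    """
    seq_len = len(new_entries)
    # Overwrite the slots for the new tokens (same-length slice assignment,
    # so the preallocated cache keeps its size).
    cache[start_pos:start_pos + seq_len] = new_entries
    # Attention reads every cached position plus the new ones.
    return cache[:start_pos + seq_len]


max_seq_len = 8                      # hypothetical preallocated capacity
cache = [None] * max_seq_len

first = append_to_cache(cache, ["k0", "k1", "k2"], start_pos=0)   # prompt chunk
second = append_to_cache(cache, ["k3", "k4"], start_pos=3)        # follow-up chunk

print(len(first), len(second))
```

On the second call the new chunk has seqlen 2, but the keys visible to attention span 3 cached entries plus 2 new ones.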
Alex
6b3154bfbb
Update transformer mask comment
...
Update names for consistency with code
Co-authored-by: ruanslv <ruanslv@gmail.com>
2023-11-13 13:41:06 -06:00
Joseph Spisak
4835a30a1c
Merge pull request #916 from facebookresearch/jspisak-patch-6
...
Update README.md
2023-11-10 07:39:29 -08:00
Joseph Spisak
94b055f4ae
Update README.md
2023-11-10 07:38:39 -08:00
Suraj Subramanian
dccf644213
fix faq link
2023-11-08 08:47:11 -05:00
Suraj Subramanian
9cd8d505ca
Update issue templates
2023-11-08 08:13:08 -05:00
flu0r1ne
e9077bd241
Fix key-value caching for seqlen != 1
...
This commit fixes a bug in the key-value caching. Currently,
a square attention mask is misapplied to the scores matrix
despite not matching the shape of the scores matrix. This
results in a runtime error. In a correct implementation, the
decoder mask needs to describe how the new seq_len tokens
interact with all the cached tokens. That is, the attention
mask needs to be of shape (seq_len, total_len), indicating how
the token at row i (representing token i + cached_len in the
transformer model) attends to token j. Accordingly, the matrix
needs to mask entries where j > cached_len + i. This patch
horizontally prepends a (seq_len, cached_len) block of zeros to an
upper-triangular mask of size (seq_len, seq_len) to form the
(seq_len, total_len) mask.
2023-11-02 19:33:26 -05:00
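The mask described in the commit above can be sketched in plain Python. This is a minimal illustration, not the patch itself: `build_decoder_mask` is a hypothetical helper, nested lists stand in for tensors, and the zero block is assumed to sit on the left, which is what the condition j > cached_len + i implies.

```python
NEG_INF = float("-inf")


def build_decoder_mask(seq_len, cached_len):
    """Return a (seq_len, cached_len + seq_len) additive attention mask.

    Entry [i][j] is -inf where new token i must not attend to position j
    (j > cached_len + i) and 0.0 everywhere else.
    """
    # Strictly upper-triangular block: new token i cannot see new token k > i.
    causal = [
        [NEG_INF if k > i else 0.0 for k in range(seq_len)]
        for i in range(seq_len)
    ]
    # Cached tokens are always visible, so each row starts with cached_len zeros.
    return [[0.0] * cached_len + row for row in causal]


mask = build_decoder_mask(seq_len=3, cached_len=2)
for row in mask:
    print(row)
```

With `cached_len=0` this reduces to the ordinary square causal mask, and with `seq_len=1` the single row is all zeros, which is why the earlier square mask only worked in the seqlen == 1 case.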
Joseph Spisak
54d4463105
Merge pull request #897 from JacobHelwig/main
...
Correct "bug," typo to "bug", in README.md
2023-11-02 10:53:24 -07:00
JacobHelwig
7909dee4a8
Correct "bug," typo to "bug", in README.md
2023-11-02 12:47:01 -05:00
Joseph Spisak
b5cd38ad2d
Merge pull request #891 from facebookresearch/jspisak-patch-5
...
Delete FAQ.md
2023-11-01 20:48:24 -07:00
Joseph Spisak
664ddc8c8f
Delete FAQ.md
2023-11-01 20:47:17 -07:00
Joseph Spisak
3f750f4c4d
Merge pull request #890 from facebookresearch/jspisak-patch-4
...
Update README.md
2023-11-01 20:45:20 -07:00
Joseph Spisak
786af96785
Update README.md
2023-11-01 20:44:47 -07:00
Suraj Subramanian
06faf3aab2
Add FAQs
2023-10-18 13:38:04 -04:00
Joseph Spisak
1c95a19e8c
Merge pull request #860 from facebookresearch/add-issue-template
...
Update issue templates
2023-10-16 11:54:53 -07:00
Suraj Subramanian
0cc2987b61
Update issue templates
2023-10-16 14:51:11 -04:00
Joseph Spisak
6b8cff0e0c
Merge pull request #859 from yonashub/patch-1
...
[closes #858] change "Content Length" to "Context Length" in MODEL_CARD.md
2023-10-14 20:03:02 -07:00
yonashub
f9ddb1d0f3
change "Content Length" to "Context Length" in MODEL_CARD.md
2023-10-14 21:30:47 -05:00
Joseph Spisak
556949fdfb
Merge pull request #851 from sekyondaMeta/FAQ-updates
...
FAQ updates
2023-10-11 12:18:13 -07:00
Joseph Spisak
0da077cff6
Update FAQ.md
...
made some small fixes and added some context.
2023-10-11 12:17:35 -07:00
sekyondaMeta
5d9bb58a65
Update FAQ.md
2023-10-11 15:08:34 -04:00
sekyondaMeta
98851c3009
Update FAQ.md
2023-10-11 15:05:15 -04:00
samuelselvan
7e1b864d57
Merge pull request #822 from kierenAW/main
...
Add "--continue" flag to wget for model binary in order to resume dl
2023-09-29 09:32:40 -07:00