Commit Graph

  • 39f75c7001 Merge pull request #2945 from marko1616/bugfix/lora-model-merge hoshi-hiyouga 2024-03-25 13:36:08 +08:00
  • 7f99cb1817 pass ruff check marko1616 2024-03-24 16:12:10 +08:00
  • c555b2cce3 fix Llama lora merge crash marko1616 2024-03-24 03:06:11 +08:00
  • 2eba1c6851 fix Llama lora merge crash marko1616 2024-03-24 02:55:23 +08:00
  • edeed55664 fix Llama lora merge crash marko1616 2024-03-24 02:44:35 +08:00
  • 92248f9cb2 fix #2936 hiyouga 2024-03-24 00:43:21 +08:00
  • c548ad5e69 fix #2928 hiyouga 2024-03-24 00:34:54 +08:00
  • a57d839e1d fix #2941 hiyouga 2024-03-24 00:28:44 +08:00
  • d88a34bc79 Merge pull request #2919 from 0xez/main hoshi-hiyouga 2024-03-22 12:12:24 +08:00
  • 60cbc9d0e5 Update README_zh.md, fix the release date of the paper 0xez 2024-03-22 10:41:17 +08:00
  • d5005e766f Update README.md, fix the release date of the paper 0xez 2024-03-21 22:14:48 +08:00
  • 4d0753cffe move file hiyouga 2024-03-21 17:05:17 +08:00
  • 1cf0f11840 add citation hiyouga 2024-03-21 17:04:10 +08:00
  • 052e8b2cc6 paper release hiyouga 2024-03-21 13:49:17 +08:00
  • 8963e89633 update readme hiyouga 2024-03-21 00:48:42 +08:00
  • 935ee0a023 support fsdp + qlora hiyouga 2024-03-21 00:36:06 +08:00
  • 5ed234ca63 add orca_dpo_pairs dataset hiyouga 2024-03-20 20:09:06 +08:00
  • 04884a0911 Merge pull request #2905 from SirlyDreamer/main hoshi-hiyouga 2024-03-20 18:09:54 +08:00
  • c7af26a9e3 fix #2777 #2895 hiyouga 2024-03-20 17:59:45 +08:00
  • d8073488be fix #2346 hiyouga 2024-03-20 17:56:33 +08:00
  • 6fc2d7e063 Follow HF_ENDPOINT environment variable SirlyDreamer 2024-03-20 08:31:30 +00:00
  • e93c7cdb80 Updated README with new information khazic 2024-03-20 14:38:08 +08:00
  • c32d6c8250 Updated README with new information khazic 2024-03-20 14:21:16 +08:00
  • 757158da63 Updated README with new information 刘一博 2024-03-20 14:11:28 +08:00
  • ffdacaa618 fix packages hiyouga 2024-03-17 22:32:03 +08:00
  • e194efab10 fix patcher hiyouga 2024-03-15 19:18:42 +08:00
  • 772fc2eac7 Merge pull request #2849 from S3Studio/DockerizeSupport hoshi-hiyouga 2024-03-15 19:16:02 +08:00
  • ed020579dc fix export hiyouga 2024-03-15 15:06:30 +08:00
  • 096869c7b6 Use official Nvidia base image S3Studio 2024-03-14 18:03:33 +08:00
  • c6873211e9 improve Docker build and runtime parameters S3Studio 2024-03-12 14:05:10 +08:00
  • 623ee1bd88 tiny fix hiyouga 2024-03-14 21:19:06 +08:00
  • aabe90343e fix export hiyouga 2024-03-14 18:17:01 +08:00
  • 764cfb506d fix bug hiyouga 2024-03-13 23:55:31 +08:00
  • 249ad56075 fix bug hiyouga 2024-03-13 23:43:42 +08:00
  • 46f99ff277 improve lora+ impl. hiyouga 2024-03-13 23:32:51 +08:00
  • 73f4513c84 Merge pull request #2830 from qibaoyuan/lora_plus hoshi-hiyouga 2024-03-13 20:15:46 +08:00
  • 3c91e86268 [FEATURE]: ADD LORA+ ALGORITHM 齐保元 2024-03-13 19:43:27 +08:00
  • 42473ec150 fix #2817 hiyouga 2024-03-13 12:42:03 +08:00
  • 6a4e4b9c5b fix #2802 hiyouga 2024-03-13 12:33:45 +08:00
  • 9a784fb4f3 fix kv cache hiyouga 2024-03-13 01:21:50 +08:00
  • 43fd80a1aa support QDoRA hiyouga 2024-03-12 22:12:42 +08:00
  • e6ab1a57ea patch for gemma cpt hiyouga 2024-03-12 21:21:54 +08:00
  • 282edb9161 fix plot issues hiyouga 2024-03-12 18:41:35 +08:00
  • dff77004f2 support olmo hiyouga 2024-03-12 18:30:38 +08:00
  • 6c1b4aec75 fix #2802 hiyouga 2024-03-12 17:08:34 +08:00
  • 7814db1b42 fix #2803 hiyouga 2024-03-12 16:57:39 +08:00
  • c9ed3fc3a4 fix #2782 #2798 hiyouga 2024-03-12 15:53:29 +08:00
  • 9ee416a8fc Merge pull request #2743 from S3Studio/DockerizeSupport hoshi-hiyouga 2024-03-12 00:05:49 +08:00
  • 4f9a47c026 fix #2775 hiyouga 2024-03-11 00:42:54 +08:00
  • 3fcb1c6d09 tiny fix hiyouga 2024-03-11 00:17:18 +08:00
  • 7c492864e9 update parser hiyouga 2024-03-10 13:35:20 +08:00
  • 7ff8a064f3 support layerwise galore hiyouga 2024-03-10 00:24:11 +08:00
  • c635bbe465 fix #2732 hiyouga 2024-03-09 22:37:16 +08:00
  • 4881f4e631 allow non-packing pretraining hiyouga 2024-03-09 22:21:46 +08:00
  • c631799f5d fix #2766 hiyouga 2024-03-09 21:35:24 +08:00
  • 48846676d8 use default arg for freeze tuning hiyouga 2024-03-09 06:08:48 +08:00
  • f37d481c5d add GaLore results hiyouga 2024-03-09 04:11:55 +08:00
  • 5d7d8bd55c update hardware requirements hiyouga 2024-03-09 03:58:18 +08:00
  • 8ed1463236 update examples hiyouga 2024-03-09 02:30:37 +08:00
  • 43b2ede0f8 fix #2756 , patch #2746 hiyouga 2024-03-09 02:01:26 +08:00
  • 2f095e2017 Merge pull request #2746 from stephen-nju/main hoshi-hiyouga 2024-03-09 01:37:00 +08:00
  • 9b55bb964c Update setup.py hiyouga 2024-03-09 00:14:48 +08:00
  • 9b97b23ce7 fix aqlm version hiyouga 2024-03-09 00:09:09 +08:00
  • 53ab28533e fix example params hiyouga 2024-03-08 20:41:43 +08:00
  • 940c00e7ae update stephen_zhu 2024-03-08 12:47:44 +08:00
  • 18cfd5f349 fix ppo runtime error stephen 2024-03-08 11:48:26 +08:00
  • 6169df1c52 Add dockerize support S3Studio 2024-03-08 10:47:28 +08:00
  • d46c2bbcba update readme hiyouga 2024-03-08 03:06:21 +08:00
  • 48d4364586 fix chat engine, update webui hiyouga 2024-03-08 03:01:53 +08:00
  • 8042c66a76 Update setup.py hiyouga 2024-03-08 01:23:00 +08:00
  • 3879d79b89 update galore args hiyouga 2024-03-08 01:17:32 +08:00
  • e416cecf62 fix galore hiyouga 2024-03-08 00:44:51 +08:00
  • 81fcb80466 add Yi-9B model hiyouga 2024-03-07 23:11:57 +08:00
  • bf812fbe40 add galore examples hiyouga 2024-03-07 22:53:45 +08:00
  • 1e6fb6c8aa support galore hiyouga 2024-03-07 22:41:36 +08:00
  • 5d0c95bd02 update readme hiyouga 2024-03-07 20:34:49 +08:00
  • 7cd2417002 tiny fix hiyouga 2024-03-07 20:29:34 +08:00
  • 16851d66e5 Merge pull request #2739 from hiyouga/dev-vllm hoshi-hiyouga 2024-03-07 20:28:18 +08:00
  • 056d2d956a support vllm hiyouga 2024-03-07 20:26:31 +08:00
  • 9a69cadab3 fix #2735 hiyouga 2024-03-07 16:15:53 +08:00
  • 3de642bffd Merge pull request #2730 from cx2333-gt/main hoshi-hiyouga 2024-03-07 14:37:18 +08:00
  • 286b9d9849 revert choice name cx2333 2024-03-07 14:28:55 +08:00
  • cef1ede826 fix chatglm3 template hiyouga 2024-03-07 14:26:16 +08:00
  • 5007566588 fix flash_attn in train_web cx2333 2024-03-07 10:13:55 +08:00
  • e93fb3cc6c tiny fix hiyouga 2024-03-06 17:25:08 +08:00
  • 7578209735 export use balanced gpu hiyouga 2024-03-06 16:33:14 +08:00
  • 67f02f75d0 fix add tokens hiyouga 2024-03-06 15:04:02 +08:00
  • 73d9dfc7ab fix version checking hiyouga 2024-03-06 14:51:51 +08:00
  • 6b407092d9 update examples hiyouga 2024-03-06 13:14:57 +08:00
  • 3168abc0a1 fix arg dtype hiyouga 2024-03-05 20:53:30 +08:00
  • 46ee267cfc improve aqlm optim hiyouga 2024-03-05 20:49:50 +08:00
  • a10bead9b5 optimize aqlm training hiyouga 2024-03-05 18:35:41 +08:00
  • 3553e301dd fix dora inference hiyouga 2024-03-05 11:51:41 +08:00
  • 02b838b9b0 fix export model hiyouga 2024-03-05 11:05:41 +08:00
  • b1de6d1025 update readme hiyouga 2024-03-05 03:20:23 +08:00
  • bc67872218 add examples hiyouga 2024-03-05 03:16:35 +08:00
  • 0229fffde5 auto set chat template hiyouga 2024-03-05 02:41:20 +08:00
  • 3555b87363 update readme hiyouga 2024-03-04 19:29:26 +08:00
  • 2dca53962e fix export on cpu device hiyouga 2024-03-04 17:35:09 +08:00
  • f4f71f2797 fix sub-process error in thread hiyouga 2024-03-03 15:04:35 +08:00