Update datasets.mdx #224

fern/pages/get-started/datasets.mdx
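
Before creating datasets, instantiate the Python SDK client. A minimal setup sketch: the client construction is taken from the page's earlier setup snippet, and the `import` line is the obvious completion.

```python PYTHON
import cohere

co = cohere.Client(api_key='Your API key')
```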

### Dataset Creation

Datasets are created by uploading files, specifying both a `name` for the dataset and the dataset `type`.

The file extension and file contents have to match the requirements for the selected dataset `type`. See the table below to learn more about the supported dataset types.

The dataset `name` is useful when browsing the datasets you've uploaded. In addition to its name, each dataset will also be assigned a unique `id` when it's created.
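
For instance, here is a minimal sketch of browsing your uploaded datasets; it assumes the SDK's `datasets.list` endpoint and a `datasets` field on its response.

```python PYTHON
# list uploaded datasets and print each one's name and generated id
for ds in co.datasets.list().datasets:
    print(ds.name, ds.id)
```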

Here is an example code snippet illustrating the process of creating a dataset, with both the `name` and the dataset `type` specified.

```python PYTHON
my_dataset = co.datasets.create(
    name="shakespeare",
    data=open("./shakespeare.jsonl", "rb"),
    type="chat-finetune-input")

print(my_dataset.id)
```

### Dataset Validation

Whenever a dataset is created, the data is validated asynchronously against the rules for the specified dataset `type`. This validation is kicked off automatically on the backend, and must be completed before a dataset can be used with other endpoints.

Here's a code snippet showing how to check the validation status of a dataset you've created.
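
A minimal sketch, assuming the response from `co.datasets.get` exposes the dataset and its `validation_status` field:

```python PYTHON
# fetch the dataset by id and inspect its current validation status
response = co.datasets.get(id=my_dataset.id)
print(response.dataset.validation_status)  # assumed field name on the response

# alternatively, block until validation has finished
co.wait(my_dataset)
```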

Here is an example of creating a dataset for use with embed jobs:

```python PYTHON
ds = co.datasets.create(
    name='sample_file',
    # insert your file path here - you can upload it on the right - we accept .csv and jsonl files
    data=open('embed_jobs_sample_data.jsonl', 'rb'),
    keep_fields=['wiki_id', 'url', 'views', 'title'],
    optional_fields=['langs'],
    type="embed-input"
)

# wait for the dataset to finish validation
co.wait(ds)
```
In the example below, we will create a new dataset and upload an evaluation set.

```python PYTHON
# classes for configuring the fine-tuning request
from cohere.finetuning import BaseModel, FinetunedModel, Settings

# create a dataset
my_dataset = co.datasets.create(
    name="shakespeare",
    type="chat-finetune-input",
    data=open("./shakespeare.jsonl", "rb"),
    eval_data=open("./shakespeare-eval.jsonl", "rb")
)

co.wait(my_dataset)

# start training a custom model using the dataset
co.finetuning.create_finetuned_model(
    request=FinetunedModel(
        name="shakespearean-model",
        settings=Settings(
            base_model=BaseModel(
                base_type="BASE_TYPE_CHAT",
            ),
            dataset_id=my_dataset.id
        ),
    )
)
```

### Dataset Types
Here is an example code snippet showing how to fetch a dataset by its unique `id`.

```python PYTHON
my_dataset = co.datasets.get(id="<DATASET_ID>")

# print each entry in the dataset
for record in my_dataset:
    print(record)

# save the dataset as jsonl (save_dataset is assumed to be the SDK's download helper; the path is a placeholder)
co.utils.save_dataset(dataset=my_dataset, file_path='./my_dataset.jsonl', format="jsonl")
```