Commit eb9a410

committed
ai accelerator homework solutions
1 parent 0b461d0 commit eb9a410

1 file changed

+29
-1
lines changed

07_AITestbeds/README.md

Lines changed: 29 additions & 1 deletion
@@ -9,7 +9,8 @@ We will cover an overview of the AI accelerators landscape with a focus on Samba
99

1010
## Slides
1111

12-
* [Intro to AI Series: AI Accelerators](./AI_Accelerators.pdf)
12+
* [Intro to AI Series: AI Accelerators]()
13+
> Slides will be uploaded shortly after the talk.
1314
1415
## Hands-On Sessions
1516

@@ -36,6 +37,33 @@ You need to submit either Theory Homework or Hands-on Homework.
3637
* Based on hands-on sessions, describe a typical workflow for refactoring an AI model to run on one of ALCF's AI testbeds (e.g., SambaNova or Cerebras). What tools or software stacks are typically used in this process?
3738
* Give an example of a project that would benefit from AI accelerators and why?
3839

40+
41+
<details>
42+
<summary>Theory Homework Solutions</summary>
43+
44+
1. **What are the key architectural features that make these systems suitable for AI workloads?**
45+
The key architectural features that make AI accelerators such as the SambaNova, Cerebras, Graphcore, and Groq systems suitable for AI workloads are:
46+
1. Specialized Hardware Design to accelerate matrix multiplications and tensor operations.
47+
2. High Memory Bandwidth and a larger amount of on-chip memory help accelerate memory-intensive AI workloads.
48+
3. Scalability and Parallelism: Data is processed in parallel across many cores or processing units, which significantly speeds up training and inference tasks.
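
The memory bandwidth point above can be made concrete with a rough roofline estimate. The following is a minimal sketch, not a vendor model: the function names and the `PEAK_FLOPS` and `MEM_BANDWIDTH` values are hypothetical and chosen only for illustration. It assumes each matrix crosses the memory interface exactly once.

```python
def matmul_arithmetic_intensity(m, n, k, bytes_per_element=2):
    """FLOPs per byte moved for C = A @ B with A (m x k) and B (k x n).

    Assumes each matrix is read or written exactly once (an idealization).
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline bound: limited by compute peak or by bandwidth * intensity."""
    return min(peak_flops, mem_bandwidth * intensity)


# Hypothetical device numbers, for illustration only.
PEAK_FLOPS = 100e12     # 100 TFLOP/s
MEM_BANDWIDTH = 1e12    # 1 TB/s

for shape in [(1, 4096, 4096), (4096, 4096, 4096)]:
    ai = matmul_arithmetic_intensity(*shape)
    bound = attainable_flops(ai, PEAK_FLOPS, MEM_BANDWIDTH)
    print(f"shape={shape} intensity={ai:.2f} FLOP/byte bound={bound:.3e} FLOP/s")
```

Under these assumptions, the batch-1 case (a matrix-vector product) is limited by memory bandwidth, while the large square matmul is limited by compute. This is one way to see why more on-chip memory and bandwidth matter for memory-intensive workloads.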
49+
50+
51+
2. **Identify the primary differences between these AI accelerator systems in terms of their architecture and programming models.**
52+
53+
1. SambaNova's Reconfigurable Dataflow Unit (RDU) allows for flexible dataflow processing and features a multi-tiered memory architecture with terabytes of addressable memory for efficient handling of large data.
54+
2. Cerebras' Wafer-Scale Engine (WSE) consists of processing elements (PEs), each of which has its own memory and operates independently. A fine-grained dataflow control mechanism within its PEs makes the system highly parallel and scalable.
55+
3. Graphcore’s Intelligence Processing Unit (IPU) consists of many interconnected processing tiles, each with its own core and local memory. The IPU operates in two phases—computation and communication—using Bulk Synchronous Parallelism (BSP).
56+
4. Groq’s Tensor Streaming Processor (TSP) architecture focuses on deterministic execution, which is particularly advantageous for inference tasks where low latency is critical.
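
The Bulk Synchronous Parallel pattern mentioned for the IPU can be illustrated in plain Python. This is a toy sketch using threads and a barrier. It is not Graphcore's implementation or API, and the function name `bsp_neighbor_sum` is hypothetical. Each "tile" owns one value, sends it to its right-hand neighbor in a communication phase, then updates its local state in a computation phase, with a barrier separating the phases.

```python
import threading


def bsp_neighbor_sum(values, supersteps):
    """Run `supersteps` BSP rounds on a ring of tiles, one thread per tile.

    Each round: every tile sends its value to the right neighbor, all tiles
    synchronize, then every tile adds the received value to its own.
    """
    n = len(values)
    state = list(values)   # state[i] is local to tile i
    inbox = [0] * n        # inbox[i] is written only by tile i's left neighbor
    barrier = threading.Barrier(n)

    def tile(i):
        for _ in range(supersteps):
            # Communication phase: send local value to the right neighbor.
            inbox[(i + 1) % n] = state[i]
            barrier.wait()  # all messages delivered before anyone computes
            # Computation phase: touch only local state and own inbox.
            state[i] += inbox[i]
            barrier.wait()  # all tiles done before inboxes are overwritten

    threads = [threading.Thread(target=tile, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state


print(bsp_neighbor_sum([1, 2, 3, 4], supersteps=1))
```

Because no tile reads a value that another tile may be writing in the same phase, the result does not depend on thread scheduling. That separation of phases is the property the BSP model relies on.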
57+
58+
59+
3. **Based on hands-on sessions, describe a typical workflow for refactoring an AI model to run on one of ALCF's AI testbeds (e.g., SambaNova or Cerebras). What tools or software stacks are typically used in this process?**
60+
61+
A typical workflow involves using a vendor-specific implementation of an ML framework such as PyTorch to port the model. Refer to the following documentation examples to understand the details of the workflow.
62+
* [PyTorch to PopTorch](https://docs.graphcore.ai/projects/poptorch-user-guide/en/latest/pytorch_to_poptorch.html)
63+
* [SambaFlow Model Conversion](https://docs.sambanova.ai/developer/latest/porting-overview.html)
64+
</details>
65+
66+
3967
##### Hands-on Homework
4068

4169
* [Cerebras Homework](./Cerebras/README.md#homework)
