[NPU] Tenstorrent vs. Rebellions

by 올드뉴스 2024. 8. 26.

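ML and DL workloads have so far run mostly on GPUs, but NPU-based accelerators are now drawing attention.

AI workloads can be divided into training and inference, as shown below.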

Source: https://www.irsglobal.com/bbs/rwdboard/20696 (data: Nomura Securities, Investment Information Department)

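GPU-based AI training on NVIDIA and AMD hardware builds a model from large volumes of raw data; the trained model is then used for inference.

NPUs target the inference side: they take an already-trained model and run it on comparatively small amounts of input data.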

Difference between training and inference: https://manchann.tistory.com/16
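To make the split concrete, here is a minimal sketch in plain Python. It is not taken from either vendor's SDK: the toy model (`y = w*x + b`), the synthetic data, and the function names `train` and `infer` are all assumptions chosen for illustration. Training loops over the whole dataset many times to find the weights, while inference is a single cheap forward pass with the weights frozen.

```python
import random

def train(samples, epochs=200, lr=0.3):
    """Training: fit y = w*x + b by gradient descent over the full dataset."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            err = (w * x + b) - y          # prediction error for this sample
            grad_w += 2 * err * x / n      # d(MSE)/dw
            grad_b += 2 * err / n          # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(model, x):
    """Inference: one forward pass with frozen weights, no gradients."""
    w, b = model
    return w * x + b

random.seed(0)
# Hypothetical data set: y = 3x + 1 plus a little noise (assumption for this demo).
xs = [random.uniform(-1, 1) for _ in range(1000)]
data = [(x, 3 * x + 1 + random.gauss(0, 0.1)) for x in xs]

model = train(data)        # heavy: 200 passes over 1000 samples
print(infer(model, 0.5))   # light: one multiply and one add
```

The asymmetry in work between `train` and `infer` is the reason inference-only hardware can be smaller and draw less power than a training GPU.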

NPUs are promising for real-world edge AI devices, such as cars and aircraft, that have to be small and low-power.

https://www.irsglobal.com/bbs/rwdboard/20696

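Tenstorrent

A Canadian startup building RISC-V-based AI accelerator hardware. Jim Keller joined as CTO in 2021 and has led the company as CEO since 2023.

Tenstorrent Wormhole

Based on Tensix cores.
The n150 and n300 are PCIe 4.0 cards that plug into existing x86 systems.
Ships with a high-level API, TT-Buda, and a low-level SDK, TT-Metalium.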

Rebellions

Rebellions develops NPUs (Neural Processing Units): system semiconductors aimed at data centers, specialized for low power and optimized for AI inference.

Feature	Tenstorrent Wormhole n150	Tenstorrent Wormhole n300	Rebellions Atom
Architecture	Tensix Cores, RISC-V	Tensix Cores, RISC-V	ION Core, CGRA
Cores	72	128 (64 per ASIC)	Multi-core
Memory	12GB GDDR6	24GB GDDR6	16GB GDDR6
Memory Bandwidth	288 GB/sec	576 GB/sec	256 GB/sec
SRAM	108MB	192MB (96MB per ASIC)	64MB
Performance (FP8)	262 TFLOPS	466 TFLOPS	32 TFLOPS (FP16)
Power Consumption	Up to 160W	Up to 300W	60-150W
Interface	PCI Express 4.0 x16	PCI Express 4.0 x16	PCIe Gen5
Cooling	Passive (Active Kit incl.)	Passive (Active Kit incl.)	Configurable
Price	$999	$1,399	Not specified
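
The table invites a quick efficiency check. The sketch below uses only the peak FP8, maximum power, and price figures listed above for the two Wormhole cards. The Atom is left out because its figure is FP16 and its price is not specified. The dictionary layout and the `efficiency` helper are illustrative assumptions, and peak numbers say little about sustained throughput on a real model.

```python
# Peak figures copied from the table above (vendor peak numbers, not measurements).
cards = {
    "Wormhole n150": {"tflops_fp8": 262, "max_watts": 160, "price_usd": 999},
    "Wormhole n300": {"tflops_fp8": 466, "max_watts": 300, "price_usd": 1399},
}

def efficiency(spec):
    """Return peak TFLOPS per watt and peak TFLOPS per 1,000 USD."""
    return {
        "tflops_per_watt": spec["tflops_fp8"] / spec["max_watts"],
        "tflops_per_kusd": spec["tflops_fp8"] / spec["price_usd"] * 1000,
    }

results = {name: efficiency(spec) for name, spec in cards.items()}

for name, r in results.items():
    print(f"{name}: {r['tflops_per_watt']:.2f} TFLOPS/W, "
          f"{r['tflops_per_kusd']:.0f} TFLOPS per $1,000")
```

On these peak numbers the n150 comes out slightly ahead per watt, while the n300 is ahead per dollar.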

NVIDIA H100 Tensor Core GPU

Architecture	NVIDIA Hopper
Cores	16896 CUDA Cores
Tensor Cores	528
Memory	80GB HBM2e
Memory Bandwidth	2 TB/sec
Performance (FP64)	60 TFLOPS
Performance (FP32)	120 TFLOPS
Performance (FP16)	480 TFLOPS
Performance (INT8)	960 TOPS
NVLink	900 GB/sec
Power Consumption	700W
Interface	PCI Express 5.0
Cooling	Active
Price	Not specified
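
Because the compute figures in the two tables are quoted at different precisions (FP8, FP16), a precision-independent metric is easier to compare across all four devices. The sketch below computes memory bandwidth per watt from the listed numbers. It rests on two assumptions: the Atom is taken at the upper end of its 60-150W range, and 2 TB/sec is treated as 2000 GB/sec. The `bandwidth_per_watt` helper is a name made up for this example.

```python
# (memory bandwidth in GB/sec, power in watts), copied from the tables above.
devices = {
    "Wormhole n150": (288, 160),
    "Wormhole n300": (576, 300),
    "Rebellions Atom": (256, 150),   # assumption: upper end of the 60-150W range
    "NVIDIA H100": (2000, 700),      # assumption: 2 TB/sec taken as 2000 GB/sec
}

def bandwidth_per_watt(gb_per_sec, watts):
    """Memory bandwidth delivered per watt of rated power."""
    return gb_per_sec / watts

# Sort devices from highest to lowest GB/sec per watt.
ranking = sorted(
    ((name, bandwidth_per_watt(bw, w)) for name, (bw, w) in devices.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, value in ranking:
    print(f"{name}: {value:.2f} GB/sec per W")
```

The ordering is sensitive to the power assumption: at the lower end of its range (60W) the Atom would score about 4.27 GB/sec per W, so treat the result as a rough guide only.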


