Technologique (@technologique) — Telegram-канал

The best local LLM inference setup:4x Mac Studio (M3 Ultra, 512 GB of unified RAM), 2 TB of UMA RAM with RDMAEXO 1.0 tooling for clustering, now with tensor parallelism enabled!RDMA (Remote Direct Memory Access) though Thunderbolt 5 - clustering bottleneck eliminatedMLX inference acceleration (now with RDMA support!)And... Mac OS 26.2https://www.youtube.com/watch?v=A0onppIyHEg&t=3m10sDeepSeek v3.2 8 bit quantization (original training quantization) at 25 tokens per second! Wow!516 Watts at the peak of power usage!Downside: cost of 50K USD for hardware. Still better than one or several H100/H200/B200 with limited non unified discrete memory architecture! =)And such setup will work for way much cheaper Mac Minis (no RDMA yet and Thunderbolt 5, but will be added to new generations of M chips, now available in M4 Pro and higher and M3 Ultra)!Apple way ahead of all again!In couple of years this will be a common consumer setup for local LLM inference, using conventional hardware, an APUs from AMD and Intel+NVidia (with integrated CPU+GPU NVLink bus - an upcoming APU architecture), while Apple and NVidia will use Intel Fabs and TSMC fabrication.The enclaves/TEE for hardware memory encryption will be part of such setups for confidential computing over confidential sensitive data.#CPU#GPU#LLM#TEE

20 дек. 2025 г.235В Telegram

Python 3.14 https://blog.miguelgrinberg.com/post/python-3-14-is-here-how-fast-is-it In short - new Python 3.14 it's awesome! Worth to update immediately! 3.14 is way much better in performance than any previous versions, has optionally enabled JIT (doesn't…NoGIL is definitely a huge leap forward!From 3.13 GIL can be disabled... but for this we need customly build interpreter from sources. That's the point should be refined.Cause not every main Linux distro now provide prebuilt packages, only Fedora (python3.14-freethreading package), OpenSUSE (python314-nogil package), Ubuntu (python3.14-nogil package through external PPA) and Nix (python314FreeThreading package), in Gentoo via own ebuild, or in Arch via own pkgbuild script.This will provide python3.14t with NoGIL enabled by default, and we can enable GIL with PYTHON_GIL environment variable or the command-line option -X gil for CPython.But... free-threaded CPython build is not thread safe!Thread safety, i.e. managing shared mutable state for simultaneous threads, using locks, mutexes and other synchronization primitives - are fully on developer. Python code is thread safe. But CLang code (via FFI) and Python interpreter code itself, that written in CLang, can allow access to the same memory, for pointers in several threads, lead to data race and deadlocks. Also can lead to dead/hanging objects in memory and thus memory leaks in long uptimes.And this will affect run-time and revealed only in run-time.(While in Rust for example pointers/references are typed and type-safe, thus allocations/deallocations, objects lifetimes tracking, pointers/references to same data and memory regions, are tracked in compile time, via move semantics, which completely prevents dangling pointers.)Thus memory sanitizers and threads sanitizers should be used for free-threaded CPython. And not all main/core libraries in PyPI now support free-threading.https://docs.python.org/3/howto/free-threading-python.htmlhttps://py-free-threading.git

11 окт. 2025 г.278В Telegram

And there are even more comprehensive continuous benchmarking from TechEmpower, which measure performance for frameworks and libraries in different languages and ecosystems (JSON serialization, web requests/responses, DB requests and updates, etc.):https://tfb-status.techempower.com/https://www.techempower.com/benchmarks/#section=data-r23&a=2&test=updatehttps://tfb-status.techempower.com/results/d27544b6-7365-4269-a4d4-f908f0d21a3ehttps://www.techempower.com/benchmarks/#section=test&runid=d27544b6-7365-4269-a4d4-f908f0d21a3e&a=2&test=update#benchmark#benchmarks#benchmarking#TechEmpower

11 окт. 2025 г.203В Telegram

Python 3.14https://blog.miguelgrinberg.com/post/python-3-14-is-here-how-fast-is-itIn short - new Python 3.14 it's awesome! Worth to update immediately!3.14 is way much better in performance than any previous versions, has optionally enabled JIT (doesn't give too much performance boost, due to the too much dynamic nature of Python and vibrant run-time objects lifetimes) and optionally disabled GIL for multi-threading (installed as separately compiled binary in a system).But PyPy JIT still outperform CPython.Much love for Python anyways! 🙌 Python is a cross-system glue now!Comparison with Rust is just for fun here - Python always will be much more slower, due to the dynamic types dispatch through vtables. And due to the dynamic nature Python always will allow run-time unexpected behavior and run-time crashes (thus should be covered thoroughly with tests for everything), while Rust is fully static (even Dyn trait impls checked by compiler in compile time) and fully type safe (in compile time, before running).There are also more consistent benchmarking test suite across languages:https://benchmarksgame-team.pages.debian.net/benchmarksgame/box-plot-summary-charts.html(They should update Python environment soon and we'll see 3.14 results - now 3,13 used.)#Python#Rust

11 окт. 2025 г.181В Telegram

And the full speech of Geoffrey Hinton about AI anxiety, risks and warning to Humanity:https://www.youtube.com/watch?v=IkdziSLYzHw#AI#AGI

25 июл. 2025 г.273В Telegram

AI anxietyhttps://youtu.be/odUjxJy0YMoHere's Geoffrey Hinton talking about the risks...In fact, he defined and described the risks as a warning to Humanity, and the risks are as follows:Access inequality to general artificial intelligence, i.e. AGI, is the most powerful of its forms, based on various specialized agents/models that interact with each other. OpenAI GPT4o, GPT4.1, o1, o3 and o4, GPT4.5 - are such models (DeepSeek R1 as well). This means that only corporations will have access to such intelligence, but not people and the community.Since proprietary models are closed, the community is offered a closed restricted model.Only the corporation and partially the state have a full model.And AI is actually the Fourth Industrial Revolution - it significantly increases labor productivity, due to very high-level automation.Those who have access to it are both competitive and more efficient.(Our startup, Sentient OpenAGI, is eager to solve this problem of unequal access to AI and create a platform that will contribute to the development of community-driven open AGI, based on decentralized web3 technologies.)And there are risks of bad actors - like developing viruses and bio-weapons. Genetic selective weapons, etc. I.e. the conversion between the protein structure of virion shell and its cell receptors to RNA or DNA sequence of nucleotides (nucleic acid bases) is the task that already solved by neural networks, as it is mostly a combinatorial task.This is not a joke or a fantasy anymore! All these are already existing technologies.#AI#AGI

25 июл. 2025 г.235В Telegram

The data storage engine projects we're all waiting for!I was expecting data storage engines and data warehouse solutions, cloud native solutions for data lakes, will be made using Rust, as systems language, in Rust community.Long awaited stuff, for the whole time since 2015, stabilized Rust v1.0 compiler and Rust 2015 standard.https://github.com/RustFS/RustFS#Rust#RustLang#RustFS

17 июл. 2025 г.241В Telegram

The one technically great web calls service, written in Rust, using Actix and NATS:https://videocall.rshttps://app.videocall.rshttps://github.com/security-union/videocall-rs

16 июл. 2025 г.282В Telegram

AI is dangerously centralized.Why building community aligned AI is really matter, and how web3 technologies can play the key role to resolving current situation with centralized AI, owned by tech giant companies, and instead help to create a community driven ecosystem for AI development.https://x.com/oleg_golev/status/1944157582144246077The podcast:https://x.com/autonolas/status/1926675599172452539#AI#AGI#OpenAGI

13 июл. 2025 г.286В Telegram

AI and AGI should be fully open sourced and loyal to builders and community! The most important thing I should say and add to Steve's blog post is that AI should be open (now we see opposite things - a big tech concentrated AI market), free (as in freedom)…Open, Monetizable, Loyal AGI Platformhttps://www.sentient.xyz#AI#AGI#OpenAGI

15 июн. 2025 г.327В Telegram

AI and AGI should be fully open sourced and loyal to builders and community!The most important thing I should say and add to Steve's blog post is that AI should be open (now we see opposite things - a big tech concentrated AI market), free (as in freedom), monetizable and loyal, for creators/builders/developers good and for community win. And this is OML principle. And target goal of Sentient Foundation, who makes truly open AGI future, and already developed Dobby model (and Dobby is already free! =), Sentient Chat, Sentient OpenDeepSearch, OML Fingerprinting library, Agent Framework and Enclaves Framework (proud to be a leading part of it!).And all of these parts of groundbreaking product portfolio and breakthroughs are made just within less than a year!More good things to come! Stay turned!https://steveklabnik.com/writing/i-am-disappointed-in-the-ai-discourse/https://www.sentient.xyz#AI#AGI#OpenAGI

15 июн. 2025 г.296В Telegram

Whoa! We need to update our kernels!https://hoefler.dev/articles/vsock.htmlhttps://security-tracker.debian.org/tracker/CVE-2025-21756#kernel#Linux#VSock

1 мая 2025 г.325В Telegram

Amazing things has been released by Modular development team (Mojo language and Max inference backend): https://www.modular.com/blog/max-25-2-unleash-the-power-of-your-h200s-without-cuda #Mojo #MAX #AI #AGIModular provides MAX platform - it is MAX inference backend (engine) and MAX inference server (MAX Serve).Just look at this:https://builds.modular.com/models/DeepSeek-R1-Distill-Llama/8B-Q6_Khttps://builds.modular.com/models/Llama-3.3-Instruct/70B?tab=deployIn terms of deployment it is fantastic! Just one (relatively) tiny container!And in terms of programming - GPU programming and acceleration without CUDA, using Mojo language (statically LLVM compiled), which has capabilities of Rust (static memory safety), LLVM MLIR (Multi-Level Intermediate Representation) byte code compilation for amazing low level code optimization and acceleration, syntax of Python and Mojo integrates (embrace) the whole Python ecosystem. I'm playing with Mojo for quite a while already (and it is best of both worlds - Rust and Python), but MAX just used recently. And Llama.cpp not even in comparison with MAX!#Mojo#MAX#AI#AGI

1 апр. 2025 г.434В Telegram

Amazing things has been released by Modular development team (Mojo language and Max inference backend):https://www.modular.com/blog/max-25-2-unleash-the-power-of-your-h200s-without-cuda#Mojo#MAX#AI#AGI

1 апр. 2025 г.256В Telegram

https://www.youtube.com/live/AyH7zoP-JOgGreat conversation!The privacy and confidentiality should be a fundamental human right in the information and ubiquitous computations era.Always think about how your data will be used, what you say, message and what you'll prompt to search engine or AI model, how it can be and will be used, especially against your interests.#AI#AGI#privacy#confidentiality#confidential_computing#CC#security

19 мар. 2025 г.291В Telegram

Technologique

Похожие каналы

Последние посты