A Guide to The "New" AMD Socket A Athlon Processor

Last updated: 6/11/00

A quote from the AMD Athlon™ Processor Architecture:

"AMD Athlon processors feature the industry’s first seventh-generation x86 microarchitecture, which is designed to support the growing processor and system bandwidth requirements of emerging software, graphics, I/O, and memory technologies. The high-speed execution core includes multiple x86 instruction decoders, a dual-ported large 128KB split-L1 cache, a large 256KB [Socket A] or 512KB cache [OEM Slot A version of the Thunderbird], three independent integer pipelines, three address calculation pipelines, and the x86 industry’s first superscalar, fully pipelined, out-of-order, three-way floating point engine. The floating point engine is capable of delivering 4.0 Gflops of single-precision and more than 2.0 Gflops of double-precision floating point results at 1000 MHz for superior performance on numerically complex applications."

This statement is much easier to understand than the mumbo jumbo reported in my Guide to the "Classic" Athlon.  However, I will attempt to describe it in briefer and more understandable language and restrict it to the Socket A version of the CPU (the above block diagram is the best one I can find).

The Athlon is a seventh-generation X86 processor.  That means it will execute CPU instructions written for the X86 series of CPUs, and it is architecturally (and factually) a more powerful chip than its predecessors--there is more under the hood.

The first X86 generation was the Intel 8086 (and 8088) introduced circa 1978, followed by the 80286, 80386, 80486, "80586" (Pentium, K5, etc.), "80686" (Pentium II, III, K6, K6-2, K6-3, etc.), and "80786" (Athlon).  The 8086 had 29,000 transistors; the Pentium II has 7.5 million, and the "Classic" Athlon has 22 million (the Socket A version, with its on-die L2 cache, has roughly 37 million).  In short, the "New" Athlon is the second seventh-generation X86 processor.

The Athlon has three X86 instruction decoders.  An instruction set is a processor's language.  An instruction tells the processor what data to operate on and what to do with it.  An X86 instruction varies in length from one to 15 bytes.  A byte is eight bits; a bit is a logical one or zero.  A logical one or zero is represented by one of two voltage levels, or the two states of a transistorized electronic switch (on or off like a light switch; a 1 or 0 in the binary number system, which, in turn, can be used to represent characters, decimal numbers, instructions, etc.).  The decoders convert variable-length X86 instructions into fixed-length MacroOPs, the language of the Athlon.  In short, these decoders decode X86 instructions into Athlon instructions.
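To make the variable-length versus fixed-length idea concrete, here is a toy sketch in Python.  It uses three real X86 opcodes (NOP, RET, and MOV EAX with a 32-bit immediate), but everything else is an assumption made up for illustration: the function names, the tiny opcode table, and especially the 8-byte record width, which is not the real MacroOP size.  Real X86 decoding is far more involved than this.

```python
# Toy illustration only: real x86 decoding and the Athlon's MacroOP format
# are far more complex. The 8-byte record width below is a made-up stand-in.

# Lengths (in bytes) of three real x86 instructions, keyed by first byte.
OPCODE_LENGTHS = {
    0x90: 1,  # NOP
    0xC3: 1,  # RET
    0xB8: 5,  # MOV EAX, imm32 (1 opcode byte + 4 immediate bytes)
}

RECORD_WIDTH = 8  # hypothetical fixed width, NOT the real MacroOP size


def split_instructions(code: bytes) -> list[bytes]:
    """Split a byte stream into its variable-length instructions."""
    instructions = []
    pos = 0
    while pos < len(code):
        length = OPCODE_LENGTHS[code[pos]]
        instructions.append(code[pos:pos + length])
        pos += length
    return instructions


def to_fixed_width(instruction: bytes) -> bytes:
    """Pad one instruction with zero bytes up to the fixed record width."""
    return instruction.ljust(RECORD_WIDTH, b"\x00")


# MOV EAX, 42 ; NOP ; RET
program = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0x90, 0xC3])
variable = split_instructions(program)
fixed = [to_fixed_width(i) for i in variable]

print([len(i) for i in variable])  # [5, 1, 1]
print([len(r) for r in fixed])     # [8, 8, 8]
print(format(0xB8, "08b"))         # 10111000 (one byte shown as eight bits)
```

The point is only that fixed-size records are easier for the rest of the chip to handle than instructions whose length is not known until they are decoded.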

The Athlon has an Instruction Control Unit (ICU).  Up to three MacroOPs (Athlon instructions) are sent from the decoders to the ICU per CPU cycle.  The ICU buffers and manages the MacroOPs and sends them to the processor's execution unit schedulers.  In short, the ICU is a managed buffer between the decoders and schedulers.

The Athlon has two execution unit schedulers.  The first one schedules integer and address calculation MacroOPs.  The second schedules MMX, 3DNow!, and X87 MacroOPs.  In short, the Athlon has two execution schedulers which manage the execution pipelines.
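The flow from decoders to ICU to schedulers can be sketched as a small Python simulation.  This is a loose model under stated assumptions, not how the silicon works: the kind labels ("int", "addr", "mmx", "3dnow", "x87"), the function name, and the rule that the ICU empties completely every cycle are all invented here.  Only the limit of three MacroOPs per cycle and the split between the two schedulers come from the text above.

```python
from collections import deque

DECODE_WIDTH = 3  # up to three MacroOPs reach the ICU per cycle (from the text)

# Hypothetical labels for the kinds of MacroOP each scheduler accepts.
INTEGER_KINDS = {"int", "addr"}      # integer / address calculation scheduler
FP_KINDS = {"mmx", "3dnow", "x87"}   # MMX / 3DNow! / X87 scheduler


def run(macro_ops):
    """Feed MacroOPs through a toy ICU and return (cycles, int_ops, fp_ops)."""
    pending = deque(macro_ops)  # MacroOPs still waiting at the decoders
    icu = deque()               # the ICU's buffer
    integer_scheduler, fp_scheduler = [], []
    cycles = 0
    while pending or icu:
        cycles += 1
        # Decoders hand at most DECODE_WIDTH MacroOPs to the ICU this cycle.
        for _ in range(min(DECODE_WIDTH, len(pending))):
            icu.append(pending.popleft())
        # Simplification: the ICU forwards everything it holds right away.
        while icu:
            kind, name = icu.popleft()
            if kind in INTEGER_KINDS:
                integer_scheduler.append(name)
            elif kind in FP_KINDS:
                fp_scheduler.append(name)
            else:
                raise ValueError(f"unknown MacroOP kind: {kind}")
    return cycles, integer_scheduler, fp_scheduler


ops = [("int", "add"), ("addr", "lea"), ("x87", "fmul"),
       ("mmx", "paddb"), ("int", "sub"), ("3dnow", "pfadd"),
       ("addr", "load")]
cycles, int_ops, fp_ops = run(ops)
print(cycles)   # 3
print(int_ops)  # ['add', 'lea', 'sub', 'load']
print(fp_ops)   # ['fmul', 'paddb', 'pfadd']
```

Seven MacroOPs at three per cycle take three cycles to pass through, and each one lands in the scheduler that matches its kind.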

The Athlon has nine independent execution pipelines.  There are three 10-stage integer, three address calculation, and three 15-stage MMX, 3DNow!, and X87 floating-point execution pipelines.  The last three essentially do the floating-point number crunching which used to be done by a separate math coprocessor chip (X87) back in the days before the 80486 (and MMX and 3DNow!) and account for a lot of the zip in graphics (games), spreadsheet recalcs, etc.  It is also where the fancy language ("...three-issue, superscalar floating-point capability is based on three pipelined, out-of-order floating-point execution units..") comes into play.  Let's call it "magic" and be done with it.  In short, the Athlon has nine execution pipelines which can simultaneously process data.  Three of them are independent floating-point units (FPUs), which together can deliver as many as four 32-bit single-precision floating-point results in a single CPU clock cycle.
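The 4.0 Gflops figure in the AMD quote follows directly from "four results per clock cycle" at 1000 MHz.  A quick check of the arithmetic in Python (the function name is my own, and the 800 MHz line is just an illustrative clock speed; these are theoretical peaks, not measured performance):

```python
def peak_gflops(results_per_cycle, clock_mhz):
    """Theoretical peak: results per cycle times millions of cycles per second,
    converted from Mflops to Gflops."""
    return results_per_cycle * clock_mhz / 1000.0


# Four single-precision results per cycle at 1000 MHz, as in the AMD quote.
print(peak_gflops(4, 1000))  # 4.0

# The same four results per cycle at an illustrative 800 MHz clock.
print(peak_gflops(4, 800))   # 3.2
```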

The Athlon has sophisticated, dynamic branch prediction logic.  'It has a two-way, 2048-entry branch prediction table to store information used to predict the direction of conditional branches. CALL/RET instruction pairs are optimized by storing the return address of each CALL within a nested series of subroutines. A return address is supplied as the predicted target address of the corresponding RET instruction.'  In short, the Athlon has advanced branch prediction logic.
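For the curious, the two ideas in that quote (a table that remembers which way branches went, and a stack of return addresses) can be sketched in Python.  Be warned that this is a textbook-style model, not AMD's design: the 2-bit saturating counters, the simple address-modulo indexing, and the class and method names are all my assumptions, and the sketch ignores the "two-way" organization entirely.  Only the 2048-entry table size and the CALL/RET behavior come from the quote.

```python
TABLE_ENTRIES = 2048  # table size taken from the AMD quote


class BranchPredictor:
    """Toy predictor: 2-bit saturating counters plus a return-address stack.

    The counter scheme is a common textbook technique, assumed here for
    illustration; the source does not say what the Athlon's table stores.
    """

    def __init__(self):
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        self.counters = [1] * TABLE_ENTRIES
        self.return_stack = []

    def _index(self, address):
        return address % TABLE_ENTRIES  # simplistic indexing (assumption)

    def predict(self, address):
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def on_call(self, return_address):
        self.return_stack.append(return_address)

    def predict_return(self):
        return self.return_stack.pop() if self.return_stack else None


predictor = BranchPredictor()

# A loop branch that is taken nine times and then falls through once.
outcomes = [True] * 9 + [False]
correct = 0
for taken in outcomes:
    if predictor.predict(0x401000) == taken:
        correct += 1
    predictor.update(0x401000, taken)
print(correct, "of", len(outcomes))  # 8 of 10

# Nested CALLs: the most recent return address is predicted first.
predictor.on_call(0x1005)
predictor.on_call(0x2005)
first = predictor.predict_return()
second = predictor.predict_return()
print(hex(first), hex(second))  # 0x2005 0x1005
```

The predictor misses only the first pass through the loop and the final exit, which is why loops are such easy pickings for this kind of logic.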

The Athlon implements Enhanced 3DNow!™:

  • '21 original 3DNow! instructions with superscalar SIMD
  • 19 new instructions to enable improved integer math calculations for speech or video encoding and improved data movement for Internet plug-ins and other streaming applications
  • 5 new DSP instructions to improve soft modem, soft ADSL, Dolby Digital surround sound, and MP3 applications..'

In short, the Athlon can do video, 3D, sound, etc. better.


Copyright, Disclaimer, and Trademark Information Copyright © 1996-2006 Larry F. Byard.  All rights reserved. This material or parts thereof may not be copied, published, put on the Internet, rewritten, or redistributed without explicit, written permission from the author.