An assembly (or assembler) language, often abbreviated asm, is a low-level programming language for a computer, or other programmable device, in which there is a very strong (generally one-to-one) correspondence between the language and the architecture's machine code instructions. Each assembly language is specific to a particular computer architecture. In contrast, most high-level programming languages are generally portable across multiple architectures but require interpreting or compiling. Assembly language may also be called symbolic machine code.
You'll hear it said: "I thought assembly was a dead language, why waste my time?" Well, you might not find yourself writing your next app in assembly, but there is still much to gain from learning it. Today, assembly language is used primarily for direct hardware manipulation, access to specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems.
Scroll down for a little tutorial using Debug (MS-DOS).
Assembly languages, and the use of the word assembly, date to the introduction of the stored-program computer. The Electronic Delay Storage Automatic Calculator (EDSAC) had an assembler called initial orders featuring one-letter mnemonics in 1949. SOAP (Symbolic Optimal Assembly Program) was an assembly language for the IBM 650 computer written by Stan Poley in 1955.
Assembly languages eliminate much of the error-prone, tedious, and time-consuming first-generation programming needed with the earliest computers, freeing programmers from tedium such as remembering numeric codes and calculating addresses. They were once widely used for all sorts of programming. However, by the 1980s (1990s on microcomputers), their use had largely been supplanted by higher-level languages, in the search for improved programming productivity. Today assembly language is still used for direct hardware manipulation, access to specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems.
Assembly language was long the primary development language for many popular home computers of the 1980s and 1990s (such as the MSX, Sinclair ZX Spectrum, Commodore 64, Commodore Amiga, and Atari ST). This was in large part because interpreted BASIC dialects on these systems offered insufficient execution speed, as well as insufficient facilities to take full advantage of the available hardware. Some systems even had an integrated development environment (IDE) with highly advanced debugging and macro facilities.
There have always been debates over the usefulness and performance of assembly language relative to high-level languages. Assembly language has specific niche uses where it is important; see below. In the TIOBE index of programming language popularity, it was at rank 11 as of October 2016, ahead of, for example, Swift and Ruby. Assembler can be used to optimize for speed or for size. In the case of speed optimization, modern optimizing compilers are claimed to render high-level languages into code that can run as fast as hand-written assembly, despite the counter-examples that can be found. The complexity of modern processors and memory sub-systems makes effective optimization increasingly difficult for compilers, as well as for assembly programmers. Moreover, increasing processor performance has meant that most CPUs sit idle most of the time, with delays caused by predictable bottlenecks such as cache misses, I/O operations and paging. This has made raw code execution speed a non-issue for many programmers.
There are some situations in which developers might choose to use assembly language. Here are a few:
• A stand-alone executable of compact size is required that must execute without recourse to the run-time components or libraries associated with a high-level language; this is perhaps the most common situation. For example, firmware for telephones, automobile fuel and ignition systems, air-conditioning control systems, security systems, and sensors.
• Code that must interact directly with the hardware, for example in device drivers and interrupt handlers.
• In an embedded processor or DSP, high-repetition interrupts require the fewest possible cycles per interrupt, such as an interrupt that occurs 1000 or 10000 times a second.
• Programs that need to use processor-specific instructions not implemented in a compiler. A common example is the bitwise rotation instruction at the core of many encryption algorithms, as well as querying the parity of a byte or the 4-bit carry of an addition.
• Reverse-engineering and modifying video games (also termed ROM hacking), which is possible via several methods. The most widely employed method is altering program code at the assembly language level.
• Self-modifying code, to which assembly language lends itself well.
• Games and other software for graphing calculators.
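To make the rotation and parity example from the list above a bit more concrete, here is a minimal sketch in Python (used purely for illustration; the helper names rotl32 and parity_even are made up for this example). A rotate instruction does this in one step, whereas a language without a rotate operator has to spell it out with shifts and masks, roughly like this:

```python
MASK32 = 0xFFFFFFFF  # keep results inside 32 bits, like a 32-bit register


def rotl32(value, count):
    """Rotate a 32-bit value left by count bits (what a rotate-left instruction does)."""
    count %= 32
    value &= MASK32
    # Bits shifted out on the left re-enter on the right.
    return ((value << count) | (value >> (32 - count))) & MASK32


def parity_even(byte):
    """Return True when the low 8 bits contain an even number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


if __name__ == "__main__":
    print(hex(rotl32(0x12345678, 8)))
    print(parity_even(0b00000011))
```

On many processors each of these helpers corresponds to a single instruction or a single flag, which is the point the list item is making.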
Assembly language is still taught in most computer science and electronic engineering programs. Although few programmers today regularly work with assembly language as a tool, the underlying concepts remain very important. Such fundamental topics as binary arithmetic, memory allocation, stack processing, character set encoding, interrupt processing, and compiler design would be hard to study in detail without a grasp of how a computer operates at the hardware level.
So, if you've decided to learn assembly language, you'll need an environment to do so. One of the most primitive ways is probably to use Debug (debug.com), which you can find on any machine that includes MS-DOS. It's not the easiest thing in the world, but it does bring you closer to the processor, and that can be rather satisfying once you get into it!
So let's look at how to use debug:
You give debug directions with single-letter commands such as a, n, l, d, t and g. With them you can write a program, save it, load it, and so on.
To start writing code, type 'a', which simply means 'assemble', and press Enter.
After writing the last line, press Enter twice: the second, empty line ends assemble mode and returns you to the '-' prompt.
To run the program, type 'g', which means 'go', and press Enter. The program starts to run and stops at the first break. (End your code with an int 3 (break) or int 20 (return to DOS) instruction, otherwise the computer locks up. If you want to save the program, int 3 will not be enough to keep the computer from locking up; use int 20 there. You also have to be careful about DS: it should hold its previous value when your program finishes, so use 'push ds' and 'pop ds'.)
To return to DOS from the debug environment, type 'q' (quit).
So, now that you know some basics, let's write a simple program - a program that adds two numbers. We are going to use AX and BX registers and the sum will be in AX. (AX=AX+BX)
What's happening here? Well, in the listing below '«' marks each place where you press Enter.
- C:\> debug «
-a « ; start to write code
mov ax,03 « ; pass AX 03, AX will be 03
mov bx,04 « ; pass BX 04, BX will be 04
add ax,bx « ; add AX with BX, the sum is in AX
int 3 « ; break
« ; empty line, ends assemble mode
-g « ; run, that is, go: execution starts at 0100 because IP is 0100 (IP = Instruction Pointer; the code at the address it holds is executed)
And the result is shown by debug directly after 'g «'. So, we see that the sum, 07, is in AX.
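If you don't have debug at hand, the effect of the four instructions can be mimicked in a few lines of Python. This is only a rough sketch, not how debug or the processor works internally; the function name run_add_program and the dictionary of registers are assumptions made for illustration.

```python
def run_add_program():
    """Mimic the example program with plain integers standing in for registers."""
    regs = {"AX": 0x0000, "BX": 0x0000}
    regs["AX"] = 0x0003                               # mov ax,03
    regs["BX"] = 0x0004                               # mov bx,04
    regs["AX"] = (regs["AX"] + regs["BX"]) & 0xFFFF   # add ax,bx (wraps at 16 bits)
    return regs                                       # int 3: stop and look at the registers


if __name__ == "__main__":
    for name, value in run_add_program().items():
        print(f"{name}={value:04X}")
```

The four-digit hexadecimal formatting is chosen to resemble the way debug displays 16-bit registers.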
You can use 't' to trace, that is, to step through the program one instruction at a time and follow the values of the registers. At this point, however, 't' will not work, and for that matter 'g' will not run the program again either. This is because of the value of IP. Check it out: it is 0108, but our program starts at 0100. All you need to do is change the value of IP to 0100, which you can do using 'r'. If you type 'r' and press Enter, you will see the values of the registers. Cool, eh? If you want to pass a value to a register, type 'r ax', 'r bx' or 'r ip' (you get the idea), press Enter, and then type the new value.
Check out the new value of IP now. Now, when you say 'go', the processor executes what is at 0100, which is mov ax,03 here, and keeps advancing IP until a break occurs.
Up until this point, we have used the a, g, t, r and q commands and written a simple program. Now, let's see how to save a program. You are going to see how different it is from saving in other programming languages!
Assuming the program above is still in memory, there are three stages to saving it: passing its length to the CX register, naming the file with 'n', and writing it to disk with 'w'.
Here the order of passing a value to the CX register and naming the program is not terribly crucial. In fact, sometimes you don't even need to set CX, because you know it already holds the right value. For example, if you return to DOS and enter debug again with the program's name, CX will already hold the program's length. There is one other thing I need to draw your attention to: the registers that the write uses. Debug takes the number of bytes to write from BX:CX and, unless told otherwise, writes from address 0100, so BX should be 0000 here. For example, if we tried to save this program right after running it, BX would still hold 0004 and we would get a message like 'Writing 40009 bytes'. We might think we had saved it, but we wouldn't be able to load the program later.
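A quick way to see where a number like 40009 comes from: debug's write command takes its byte count from the register pair BX:CX, with BX as the high word and CX as the low word. The Python sketch below just does that arithmetic; the function names bytes_to_write and write_message are made up for this example, and the message format is only an approximation of what debug prints.

```python
def bytes_to_write(bx, cx):
    """Combine BX (high word) and CX (low word) into one 32-bit byte count."""
    return ((bx & 0xFFFF) << 16) | (cx & 0xFFFF)


def write_message(bx, cx):
    """Format the count as five hexadecimal digits, in the style of debug's report."""
    return f"Writing {bytes_to_write(bx, cx):05X} bytes"


if __name__ == "__main__":
    print(write_message(0x0000, 0x0009))  # BX cleared, CX = program length
    print(write_message(0x0004, 0x0009))  # BX still holds 4 from 'mov bx,04'
```

So a leftover 4 in BX after running the example program would be enough to turn a 9-byte save into a request for 40009 (hex) bytes.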
Here you see how many bytes have been saved. Let's try to load it.
It didn't work.
You may have spotted the mistake! I tried to explain something in terms of things I hadn't mentioned yet: I used loading to make a point before telling you how to load a program. Sorry about that. I don't think it will cause much trouble here, because this stage is simple to figure out. But in classes there can be lots of problems of that kind, so be careful. Let's return to our subject now.
There are two ways to load a program. One is, as we saw above, to type debug followed by the name of the program and press Enter. Then type 'u' (unassemble) to see the code.
You see, with ' u ' debug shows us the code with op-codes. So for example, we can easily say that B8 is the operation code of mov ax,...
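To show how op-codes and addresses fit together, here is a small Python sketch that lays out the example program byte by byte. The byte values are the encodings I would expect an 8086 assembler such as debug to produce for these instructions; treat them, and the helper names listing and imm16, as assumptions for illustration rather than as debug's actual output.

```python
# Assumed machine code for the example program (9 bytes in total).
PROGRAM = [
    (bytes([0xB8, 0x03, 0x00]), "mov ax,0003"),  # B8 = op-code, then 16-bit value
    (bytes([0xBB, 0x04, 0x00]), "mov bx,0004"),
    (bytes([0x01, 0xD8]), "add ax,bx"),
    (bytes([0xCC]), "int 3"),
]


def listing(program, start=0x0100):
    """Return (offset, hex bytes, mnemonic) rows, similar in spirit to 'u'."""
    rows = []
    offset = start
    for code, text in program:
        rows.append((offset, code.hex().upper(), text))
        offset += len(code)  # the next instruction starts right after this one
    return rows


def imm16(code):
    """Decode the little-endian 16-bit value that follows a one-byte op-code."""
    return int.from_bytes(code[1:3], "little")


if __name__ == "__main__":
    for offset, hex_bytes, text in listing(PROGRAM):
        print(f"{offset:04X} {hex_bytes:<8}{text}")
```

Note how the value 0003 is stored low byte first (03 00), and how the instruction lengths alone determine each address.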
The second way of loading a program is using 'l' (load). Let's see how.
We type 'n' and the name of the file to be loaded. Then we type just 'l', and the program is loaded. If you want to see the code, you can use 'u'.
Suppose that we want to replace the line with ' ADD AX,BX' with 'SUB AX,BX'. You see the address of that line. It is 0106.
To show you this clearly, I quit debug and entered it again, then loaded first.com. I didn't use 'u' (unassemble), so as not to clutter the screen. I know that ADD AX,BX is at address 0106. So, to replace it, we type 'a 106' and press Enter. Debug then prompts at 0106 with an empty line. We type the new instruction and press Enter twice. And that is it. It's changed.
You'll notice something interesting about CX. No value was passed to CX, yet the write completed successfully because we already knew the value of CX: it was 9, set when the program was loaded, and it never changed during anything we did afterwards!
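Why could CX stay at 9? Because the replacement instruction occupies the same number of bytes as the one it overwrites. The Python sketch below patches a copy of the program image the way 'a 106' does in spirit. The byte encodings (01 D8 for add ax,bx and 29 D8 for sub ax,bx) and the helper name patch are assumptions for illustration, so check them with 'u' on your own machine.

```python
ADD_AX_BX = bytes([0x01, 0xD8])  # assumed encoding of add ax,bx
SUB_AX_BX = bytes([0x29, 0xD8])  # assumed encoding of sub ax,bx

# Assumed 9-byte image of first.com: mov ax,03 / mov bx,04 / add ax,bx / int 3
ORIGINAL = bytes([0xB8, 0x03, 0x00, 0xBB, 0x04, 0x00]) + ADD_AX_BX + bytes([0xCC])


def patch(image, offset, new_code, load_address=0x0100):
    """Return a copy of image with new_code written at the given offset."""
    index = offset - load_address  # offset 0106 is byte 6 of the file
    patched = bytearray(image)
    patched[index:index + len(new_code)] = new_code
    return bytes(patched)


if __name__ == "__main__":
    patched = patch(ORIGINAL, 0x0106, SUB_AX_BX)
    print(ORIGINAL.hex().upper())
    print(patched.hex().upper())
    print(len(ORIGINAL), len(patched))
```

If the new instruction were longer than the old one, it would overwrite the start of the next instruction, and the length in CX would need another look before writing.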
Adapted from source(s): Wikipedia - Creative Commons; Plug for: http://www.codeproject.com/Articles/37762/How-To-Use-Debug