I start the discussion with questions that have to be answered in every case and that can't be answered
without detailed knowledge of the construction of a computer and of the CPU inside. Because of this I discuss only
operating systems for the AT-PC and one type of CPU, the i486 or a more advanced, similar type. I will call
this type of CPU an "Intel-CPU", because it was designed by the vendor Intel. But you can get a similar CPU with
the same opcodes from other vendors too - maybe extended by new opcodes for the FPU.
If you want to write an operating system, you first need an answer to two questions:
1st question: How much operating system do you need?
Answer: nearly nothing!
An operating system has to consist of some drivers and, in the main, a memory management, and it must enable the start of
programs. Some initialization has to be done too.
2nd question: Do you need to write an operating system in a "higher level" programming language?
Answer: No!
In every case an operating system has to deal with a certain machine. In every case there is a CPU inside with its
characteristic opcodes. These opcodes are fully represented only by a certain assembler language. Only this language
allows you to program everything and every state.
In contrast, a "higher level" programming language is always made to program an abstraction and not a
certain CPU. This is called "machine independence" and counted as an advantage. But machine independence isn't available
in the real world. Every operating system therefore needs to be programmed partially in assembler. This part is mostly
despised as "arch dependent".
The question can only be whether a "higher level" programming language is suited to make an operating system better in
any respect.
Again: No!
But this can't be discussed without a look at a real machine, which here is the AT-PC.
"Higher level" languages can't deal with unique features. And most sequences that are equal in effect have to be
programmed differently on a certain machine, because there has to be an electronic circuit first that implements the
logic. To deal with this logic is logical, while higher abstraction isn't logical in the sense of "rational", because it
forces wrong and longer ways to an effect.
I will now tell you what this means for those parts of an operating system which I mentioned in the answer
to the first question. The least important thing is the controllers in devices, which can be of the same type in different
computers. There some machine independence can be seen. But you are not independent of the machine which is the controller
itself.
Besides this, with the Intel-CPU you need special commands that address a separate I/O-region. Interrupt lines are
linked to CPU and memory in a particular way too. Even if registers of controllers are mapped to a memory-region (using a
PCI configuration space), they have to be addressed in a way peculiar to an Intel-CPU - different e.g. from a Motorola-CPU.
In addition the cache has to be switched off during the access. This slows down the access, so that memory mapping is not
really an advantage in every case.
I will not compare the construction of an Intel-CPU with that of a Motorola-CPU, because for years the vendors raced
to be as different as possible. And both vendors did one thing or the other very well. But this is forgotten if
"higher level" programming is used.
To search for the lowest common denominator is logically a step back!
The most remarkable idea built into an Intel-CPU is the segmentation of memory as given after switching to protected mode.
This method of addressing can't be used by a "higher level" language. But first have a look at the most important problem
solved by this construction:
Programs need to be translated to opcode using a known base-address, so that the assembler-program can translate names of
labels to physical addresses. The base-address can be e.g. the address =0.
If more than one program in memory should run with this base-address, more than one address =0 must exist - of
course this can be provided only by a trick. In every CPU this problem is solved by adding a second part of the address to
every address given by a program. In the Intel-CPU there are even 3 methods available. In real mode an addend is
used which is an absolute address divided by 16 (the "segment-address"). In protected mode an absolute base-address is
used which is part of a descriptor. The descriptor is addressed by a selector, which is an offset-address in the GDT and
has to be added to the base-address of the GDT to address the descriptor. This method seems too complicated, but it is
in fact the most significant advantage of the table GDT.
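The two calculations can be checked with a small model. The following Python sketch is not ASMOS code; the function
names, the sample GDT and the base-addresses 100000h and 200000h are made up for illustration. It only assumes the
documented 8-byte descriptor layout, in which the 32-bit base is split over bytes 2-4 and byte 7.

```python
import struct

def real_mode_linear(segment, offset):
    """Real mode: the segment value is the base-address divided by 16."""
    return (segment << 4) + offset

def make_descriptor(base, limit, access=0x9A, flags=0x4):
    """Pack an 8-byte segment descriptor (32-bit base, 20-bit limit)."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                        # limit bits 0-15
        base & 0xFFFF,                         # base bits 0-15
        (base >> 16) & 0xFF,                   # base bits 16-23
        access,                                # present, privilege, type
        (flags << 4) | ((limit >> 16) & 0xF),  # flags and limit bits 16-19
        (base >> 24) & 0xFF,                   # base bits 24-31
    )

def descriptor_base(desc):
    """Reassemble the base-address from its three pieces."""
    low = struct.unpack_from("<H", desc, 2)[0]
    return low | (desc[4] << 16) | (desc[7] << 24)

def protected_mode_linear(gdt, selector, offset):
    """Protected mode: the selector (bits 0-2 cleared) is the offset of the descriptor in the GDT."""
    position = selector & 0xFFF8
    return (descriptor_base(gdt[position:position + 8]) + offset) & 0xFFFFFFFF

# Hypothetical GDT: null descriptor, then two program segments.
gdt = bytes(8) + make_descriptor(0x00100000, 0xFFFF) + make_descriptor(0x00200000, 0xFFFF)

print(hex(real_mode_linear(0x1234, 0x0010)))
print(hex(protected_mode_linear(gdt, 0x08, 0)))
print(hex(protected_mode_linear(gdt, 0x10, 0)))
```

Both programs in the sketch use the address =0, but each one ends up at its own place in memory, because the selector
picks a different descriptor.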
While the segment-address in real mode is an obvious type of base-address, used in other CPUs too (and named
differently), the method of addressing with selectors and descriptors as a part of the construction is an outstanding
achievement of the Intel engineers - too outstanding for the imagination of system philosophers.
This is why an Intel-CPU provides a third method of addressing, which makes the use of "higher level"
languages possible. This method is the page mode, which can be switched on only after switching on protected mode.
But this is not done just by setting the PG-bit in the control-register cr0. You have to deal with a lot of consequences,
which are not an advantage at all.
First of all, the page mode does not make anything available which isn't available before
switching it on. The protected mode needed to be extended by a lot of features (which are available in that mode too)
to make the page mode available. Especially the calculation of physical addresses, needed to address the physical memory
chips, had to be expanded so much that a second calculator circuit had to be appended to the one working with selectors,
descriptors and a 32 bit offset-address.
In the page mode the offset-address in a page consists of only 12 bits. The upper 20 bits are translated through two
tables, each one indexed by 10 bits of the address. These tables are the "page directory table" and the "page table".
These tables describe the contents of regularly dimensioned regions, the "page frames", each one framing 4 KByte
(=1000h Byte).
Although splitting into two tables should reduce the memory needed, the tables can need much more memory than the always
needed GDT: 4 KByte for the directory plus 4 KByte of page table per 4 MByte of mapped memory, about 4 MByte for the
full 4 GByte address space. And the two tables cause at least two additional memory accesses per address to make
up the physical address, and at least three calculations have to be done (the contents of cr3 are part of the
address-calculation too!). This calculation does not replace the calculation based on the GDT, but extends it!
As this calculation is needed whenever a location in memory is addressed, the average speed of programs is only
half. If "task switches" are needed too, you do not only need to read a new value for cr3 from memory. You also need to
save the old value in the corresponding TSS (in memory). Much more trouble is caused by swapping.
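The split of an address in page mode and the size of the tables can be recalculated with a few lines. This is a
Python sketch for checking the arithmetic only; the function names and the sample address are assumptions and not
taken from any real system.

```python
PAGE_SIZE = 0x1000          # one page frame: 4 KByte
ENTRIES_PER_TABLE = 1024    # 10 index bits per table
ENTRY_SIZE = 4              # every table entry has 32 bits

def split_linear(linear):
    """Split a 32-bit address into directory index, table index and offset in the page."""
    directory_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    page_offset = linear & 0xFFF
    return directory_index, table_index, page_offset

def table_bytes(mapped_bytes):
    """Bytes needed for one page directory plus the page tables that cover mapped_bytes."""
    pages = -(-mapped_bytes // PAGE_SIZE)        # rounded up
    tables = -(-pages // ENTRIES_PER_TABLE)      # rounded up
    return ENTRIES_PER_TABLE * ENTRY_SIZE * (1 + tables)

print(split_linear(0x00403025))
print(table_bytes(4 * 1024**3))
```

For the full 4 GByte the sketch yields one directory and 1024 page tables of 4 KByte each.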
But this is not the whole disadvantage!
How are the 20 bits in addresses defined which can't be defined directly in the program?
And what must programs look like which are bigger than a page, but can define addresses only inside this page?
In principle this can only be done by additional program code, used during compilation and at runtime too, to
make up the actual addressing. This costs time and memory space. But system philosophers had some ideas, which at least
were fit to make money out of this problem...
Programs do not only address memory, but often have to deal with the I/O-region too, which has to be accessed in a
peculiar way in Intel-CPUs as well. The construction of the I/O-region prevents too great damage if an error occurs in
a transfer or an alpha-particle passes through the chips. As special commands have to be used to address that region,
this is another thing which isn't available in "higher level" languages without additional effort. Because of this,
controllers, which are needed at least for timing, interrupts, keyboard and so on, can't be controlled as easily as
possible.
Finally interrupt handling has to be done in a way which can't be done using "higher level" languages. There is an
Interrupt Descriptor Table (="IDT") to use, which starts at an absolute base-address, addressable only with a fitting
descriptor and defined with a special command (;lidt;).
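The operand which ;lidt; reads from memory consists of 6 bytes: a 16-bit limit (size of the table minus 1) followed
by the 32-bit base-address. The following Python sketch only builds these 6 bytes to show the layout; the function
name and the base-address 1000h are made up for illustration.

```python
import struct

IDT_ENTRIES = 256   # the CPU knows 256 interrupt vectors
GATE_SIZE = 8       # every gate descriptor in protected mode has 8 bytes

def idt_operand(base, entries=IDT_ENTRIES):
    """Build the 6-byte operand for lidt: 16-bit limit, then 32-bit base, little-endian."""
    limit = entries * GATE_SIZE - 1
    return struct.pack("<HI", limit, base)

operand = idt_operand(0x00001000)   # hypothetical base-address of the IDT
print(operand.hex())
```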
All these features are affected by the cache and its behaviour, which is controlled only by special commands and bits in
the CPU, where more than one special register has to be used to define the state of the machine (e.g. protected mode).
Descriptors contain a lot of bits too, which define states of segments. These bits control privileges and sizes of
addresses and operands. Finally a presence-bit is available, which can be used for swapping without paging.
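Where these bits sit in a descriptor can be shown with a small decoder. This is a Python sketch, assuming the usual
layout of a code- or data-descriptor (access byte at offset 5, flags in the upper half of byte 6); the function name
and the sample descriptor are illustrative only.

```python
def decode_descriptor_flags(desc):
    """Decode the state bits of an 8-byte code- or data-descriptor."""
    access = desc[5]
    flags = desc[6]
    return {
        "present": bool(access & 0x80),         # presence-bit
        "privilege": (access >> 5) & 0x3,       # descriptor privilege level 0..3
        "code_or_data": bool(access & 0x10),    # 0 would be a system descriptor
        "executable": bool(access & 0x08),      # code segment if set
        "default_32bit": bool(flags & 0x40),    # size of addresses and operands
        "limit_in_pages": bool(flags & 0x80),   # limit counted in 4 KByte units
    }

# Hypothetical 32-bit code segment, privilege 0, limit FFFFFh counted in bytes.
code32 = bytes([0xFF, 0xFF, 0x00, 0x00, 0x10, 0x9A, 0x4F, 0x00])

# The same descriptor with the presence-bit cleared, as after swapping the segment out.
swapped = bytes(code32[:5]) + bytes([code32[5] & 0x7F]) + bytes(code32[6:])

print(decode_descriptor_flags(code32))
print(decode_descriptor_flags(swapped)["present"])
```

An access to a segment with a cleared presence-bit raises a "segment not present" exception, which is the hook
for swapping whole segments.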
Thus another question has to be asked if you want to write an operating system for the AT-PC.
3rd question: Do you need all that?
Answer: No!
The engineers at Intel wanted to make a maid of all work and therefore added a lot of features, which do not all work
together. This causes the next question:
4th question: What do you need and what do you forget?
This question can be answered in different ways and touches quasi-religious convictions.
In fact you can easily find a rational answer, if you find criteria for optimization which can't be disputed:
speed, smallest need of memory, simple bug-hunting and coding.
You can depart from them if money making or other desires to show off get important too. Finally there are ideas
involved which partially rule the world of programmers. These ideas I discuss first:
There are two features of operating systems which many programmers do not want to miss, because they
think that they would miss an advantage. These features are "paging" and especially "multi tasking". Paging does
not make sense if there is no swapping to do. Swapping once was an idea to make too big programs run in too little
memory space. This game could only be played without too much effort using regularly dimensioned small parts, the
pages. But in the meantime this problem has been solved much more easily by reducing the size of transistors. More
memory cells could fit into the same chip and thus the costs of memory were reduced too. Paging and swapping became
methods from the stone age and could be of interest only if you want to store whole movies or program a giant search
engine. Too few and too slow transistors on a chip once made CPUs slow and expensive. Because of this "main frames"
were constructed, linked to lots of simple machines, which could only control a keyboard and a monitor. This caused a
lot of other problems, which needed to be solved by "multi using" and "user identification". But a lot of time and
memory was wasted for the bureaucracy needed then.
This need has gone away in the same way as the need for paging. In fact everybody who reads this has more speed
and memory at hand than the main frames made available in the Pentagon 30 years ago. Thus multi tasking is from the
stone age too.
But some people think that this method is a magic spell to make two (or more) machines out of one. Those ideas caused
constructions on chips too. Some people invented "cubes" and "supercubes", where more than one CPU was part of a
"cluster". This is cubed bullshit, because most problems are not scalable and do not allow building the roof
before the house. But if you are not convinced, you can get some pins for bus control and cache coherency as a feature
of an Intel-CPU, which can be a part of "multi processing". But for the AT-PC you can ignore that. Allotting tasks
to machines has become quite simple and more rational with advancing controllers - multi tasking without consequences
in operating systems.
To sum up, paging and multi tasking are not only no advantage, but a disadvantage, which
reduces the speed of processing and needs much more memory than the tables alone. The last consequence is a
giant waste of memory (more about this below...).
If multi tasking is to be done (and not "on demand"), then a clock is needed for timing. This is not available in the
AT-PC without many disadvantages. There is only one timer and one CMOS-clock. You will have to manage the control of
the clock with a lot of consequences...
Before the rational answer to the last question can be found, it has to be discussed how desires to make money or to gain
omnipotence make brains muddy. The quasi-religious decision for "higher level" languages and the damnation of assembler
will be discussed last...
First the question shall be whether an operating system needs to be a "supervisor". These supervisor operating
systems prevent every program from doing anything without dealing with the supervisor - especially I/O and
interrupt-handling. In an Intel-CPU a lot of features are provided to satisfy desires for omnipotence.
But there is no logical reason to desire this. Nevertheless nobody discussed this. As programs can in principle read
other programs, disassemble them and make them intelligible, this is undesirable for anyone who wants to make money,
sell his program and hide nonsense made expensive. So companies could grow up making billions of dollars which use
copyright as the base of their business like publishers, but sell "books" which nobody is allowed to read.
Not only big business, but other intentions too can make omnipotence desirable. Because of this it has to be stated
that no CPU can sense intentions (nor needs such a sense) when the program counter or an index-register addresses opcode
to make it run. Only the purpose of a program decides which opcode is to be addressed next! A supervisor is
absurd in principle if there are no intentions to hide something in the operating system. But in fact there are
some things which are better done inside an operating system. This concerns memory-management and therefore
the starting of programs too, which can't be started in protected mode without framing the code inside a program-segment,
which can only be provided by managing the GDT. But this task can be based on some tables and variables which can be
addressed by programs too. This means that in principle an operating system can be made which allows everything to be
done in programs too. The eyes of a supervisor are not only superfluous, but mainly disturbing and costing time and
code. I will discuss below how far system security is affected. But first I look at "higher level" languages and try
to find out whether that creation is worth more than a few words.
Business did not only lead engineers, but designers of "higher level" programming languages too. Although initially
the intention to generalize assembler languages was the origin of languages such as "BASIC", "FORTRAN" and so on, the
intentions to provide expressions that feel good to programmers soon faded behind intentions to feel good with the money
programmers spend for feeling good. And a second feature of those languages was soon found. Programs could be expressed
so that the compiled code could not be disassembled. Important parts of linking sequences could stay hidden in
configuration files, linker- and preprocessor-programs. The effect could occur at runtime, invisibly - especially
when privileges in protected mode were used too.
You should be astonished that these features could be hidden behind praise for the simplification and acceleration of
writing programs. If you consider more than the sensational simplification of a "+" instead of an "add", you can find
that the opposite is true: "higher level" languages do not only cause more binary code but more source code too. They
cause even much more code if you look at every part of a program! The whole program also contains the code which is
included and defined by preprocessors! This may be a whole library, compiled and linked with giant programs and effort.
But even a "+" isn't less work to write than an "add", which can be combined with a complex calculation of
the address of an operand that isn't available in higher spheres - e.g. ;add eax,[gs:ebx+ecx*4+displ];
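What the address calculation circuit does for the operand [gs:ebx+ecx*4+displ] can be recalculated in a few lines.
This is a Python sketch of the arithmetic only; the contents of the registers and the base-address of the segment
in gs are made-up values.

```python
def effective_address(base, index, scale, displacement):
    """Offset-address as the CPU forms it: base + index*scale + displacement (32 bits)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("the scale factor can only be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF

def linear_address(segment_base, offset):
    """The base-address from the descriptor is added to the offset-address."""
    return (segment_base + offset) & 0xFFFFFFFF

# Hypothetical state: gs points to a segment at 200000h, ebx = 1000h, ecx = 3, displ = 20h.
gs_base, ebx, ecx, displ = 0x00200000, 0x1000, 3, 0x20

offset = effective_address(ebx, ecx, 4, displ)
print(hex(offset), hex(linear_address(gs_base, offset)))
```

All of this happens inside one single command; no extra "add" or "shl" is needed to reach the fourth doubleword
of a table.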
Since the presents of Richard Stallman (GNU compiler for C/C++) and all the other programs presented by the lovable
and adorable fellow men from the world of gnus and penguins, there are not only despicable intentions at work. You
can read the sources - but who reads some Gbytes, and when in this life? And do you really need it?
Answer: No!
You might feel that this is the right answer while you are still unpacking the big presents. But to articulate the
answer costs some more nerves. You need to learn not only about C/C++, but about UNIX too, which was born in
another world, 30 years ago. In the meantime more than 100 derivatives were coded, which are readable as open source code.
This is enough source to find out in detail which consequences the "UNIX system philosophy" has and which consequences
were never considered...
A UNIX-system is a generalized type of operating system, which is inseparable from the "higher level" C/C++.
Most of those UNIX-types are written partly in C/C++ too. Every other "higher level" programming
language usable in those systems is translated first to C/C++ before the code is assembled.
The consequence is that the page mode must be switched on, because C/C++ can't deal with selectors. And you need
a supervisor, because of the absolute base-addresses of GDT, IDT and I/O. Finally no call of a procedure (="function")
can be done without a giant linking system, which does most of its duty during runtime.
These restrictions are normally compensated with the statement that this "higher level" of UNIX/C/C++ allows you to write
programs which run machine independent. That this is not true is evident if you read the list of options of
the GNU-compiler. There are not only a lot of C/C++-dialects to distinguish, but linkers and file-systems too. Instead
of machine dependent, you are system dependent. The same C-program does not run on the same machine under different
UNIX-type systems. And even machine independence isn't a fact. This is evident after reading driver-sources.
This can't be altered!
But the dreams and desires do not vanish. Nearly every year another UNIX-like project starts with the intention to avoid
at least one fault. The most important reason: Nobody wants to miss those big presents, written in C/C++.
Nobody seems to realize that most of the bigness is packing and garbage.
But the garbage can only be recognized if you have tried to make a NOT-UNIX-type system. There is no doubt that the
UNIX-system-philosophy does not allow you to get rid of those giant compilers, linkers, preprocessors, libraries,
bash-scripts, makefiles and so on. Because of this, every UNIX-type is the same step back to disadvantage, which is
greater if multi-tasking is provided. The greatest step back is done with UNIX-types which are not really purebred, e.g.
that most popular operating system, which changed its name another time and now manages a nearly never ending boot-time.
Booting takes about 15 minutes on laptops, so that there is nearly nothing else left to do. It is called the
"safest" system, but you can only be sure that the battery is soon empty.
This type of system philosophy does not only get on the nerves, but needs some more power stations in the world, because
it runs on millions of machines.
I did not think of environmental pollution, but of my nerves, when I searched for the answer to the questions above.
And I was interested only in solutions for some problems of my own, which could not be solved using window$ or LINUX.
At work I found that I had to do only a little more to solve classes of problems and not only my problem.
Thus I extended my intentions for the features of the operating system that I wrote and am still writing:
1) Everybody shall be able to understand and extend it.
2) The translation to opcode shall be done fast and without relation to file- or linker-systems.
3) Every decision shall be alterable and extendable by programs.
4) Every bit in the AT-PC shall be accessible without the need of code inside the operating system.
5) Every sequence in the operating system shall be replaceable during runtime.
Of course it should be as fast and simple as possible.
This can be done only if every part is written in assembler. This applies to the user space too.
Assembler is the only language which enables you to really control everything in a machine. You do not need the
knowledge of at least four programming languages (C/C++, inline-assembler, preprocessor, bash...), which are written
and used differently depending on the target-system. Instead you only need to keep in mind about two dozen commands and
rules and will be able to solve 99% of problems. The commands for the last percent can be found in the source code of
my ASMn, ASMnr and ASMat.
Thus my answer to the 4th question can be:
Nothing is forgotten and nothing is impossible!
(But forget those C/C++ presents, which are too expensive...)
If you fear that something becomes more complicated by using assembler language, you are far away from the things to
do (occasionally in a UNIX-world). My answer does not only cause less binary code for the same effect, but less source
code too. And programs for the "user space" are made with less effort than needed in other operating systems. This
needs to be explained in more detail. I name it...
I already mentioned the problem of providing 0-addresses, which is solved in the AT-PC by segmentation. Unlike the
solution in real mode, in protected mode the absolute base-addresses of segments are a part of descriptors, stacked in
the GDT. The descriptors are addressed using selectors, which are mainly offset-addresses in the GDT (bits 0,1,2 define
privilege level and destination table GDT or LDT). And I do not switch on the page mode in protected mode, but do not
prevent anyone unteachable from doing so...
One consequence is that procedures need to be called NEAR or FAR, and that every address outside the current segment
(defined in cs/ds) has to be explicitly combined with a reference to the target segment. This has to be done by loading
a segment-register with the appropriate selector. In protected mode (and in page mode too!) the bits 3-15 of the
selector, taken as an offset with bits 0-2 cleared, are added to the base-address of the GDT in the GDT-register to
address the descriptor in the GDT.
Neither names nor absolute addresses can be used in this FAR-case!
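The fields of a selector can be taken apart like this. It is a Python sketch for illustration; the function names,
the sample selector 2Bh and the GDT base-address 10000h are assumptions.

```python
def decode_selector(selector):
    """Split a 16-bit selector into its fields."""
    return {
        "privilege": selector & 0x3,                   # bits 0,1: requested privilege level
        "table": "LDT" if selector & 0x4 else "GDT",   # bit 2: destination table
        "index": selector >> 3,                        # bits 3-15: number of the descriptor
        "offset": selector & 0xFFF8,                   # the same bits taken as offset-address
    }

def descriptor_address(table_base, selector):
    """Address of the descriptor: base-address of the table plus offset from the selector."""
    return table_base + (selector & 0xFFF8)

fields = decode_selector(0x002B)
print(fields)
print(hex(descriptor_address(0x00010000, 0x002B)))
```

As a descriptor has 8 bytes, the bits 3-15 need no shift or multiplication: with the lower three bits cleared,
the selector already is the offset-address of the descriptor.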
If you want to use names, an actual assignment of addresses to names is needed. In every operating system (allowing those
names) there is a "linker" (many types of programs are in use), which links programs to each other (if needed)
or to the kernel. LINUX makes about 30000 of those references during boot time. This isn't astonishing, because
not a few fitting segments are used, but many more pages.
This is only needed if you want names!
As the kernel is written before any program which uses its services, and as every service in further programs is written
before the programs which use it, every assignment of offset-addresses to names can be derived completely from source
code. An actual assignment is superfluous in principle! And a lot of other assignments to filenames or configuration
tables need not be done in every initialisation, but can be done in an installation process!
The "parameters" normally needed when calling a procedure can be provided much more simply too, if they are defined
in registers. This can't be done without using assembler language. Providing more data than fit into registers can
be done by defining base-address and length in registers - it need not be done using the stack, which can then be
smaller and grows more predictably.
The restricted use of registers, done only implicitly by "higher level" programming languages, isn't an advantage but a
great disadvantage, because it causes a lot of transfers which can be avoided by the use of an assembler language. You
can keep a value in a register during whole sequences and you can avoid variables too by using immediate values in
registers. Finally you can use the address calculation circuit in the CPU, which compilers can make usable only
in extraordinary cases. And only if page mode is not on is the address calculation in the CPU easily enough available.
But there is another problem to solve. I discuss it starting with the fact that every program looks like this:
1.address 1.command
...........initialisation
X.address 1.command in an infinite loop containing branches to execute input or exit the program
...........procedures, called from initialisation or the loop
...........variables and constants
The order of the dotted parts could be changed too.
This program can be the only one in a machine. But to fit every purpose, it needs to be very big. As memory space
wasn't available for such a program in the early days of computer evolution, methods needed to be found to split the
one big program into a lot of parts.
If the parts are to be positioned anywhere in memory, there is not only the need to find a part, but also the need
to recognize that the part is a certain one. Therefore programs need individuality. This means that the
programs A,B,C, which should run as one program, need not only the base-address of A, but a criterion too that this is
the base-address of A and not of C. The known way to solve the problem is to name every program using filenames. But a
number as a name would be enough, because only a program needs the name! The more characters are used for names, or if
even "path-names" are used, the more time is needed for searching and the more memory is wasted for the searching program.
An ideal choice is a 64-bit-name, which can be addressed with the same offset-address as contained in a selector, plus
a second base-address for a second table of names. The name to search for can be made of 32 bits, which can easily be
selected by a ;cmp;-command. And the other 32 bits can define types of contents of the segment. You do not need a
linker-program anymore. Every program which can read the GDT can easily find every other code needed at runtime,
anywhere in memory. In the following text I call the second table "Global Nickname Table" (="GNT") and the
64-bit-names "nicknames".
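How such a search could work is modelled below. This is a Python sketch and not the code of ASMOS: the order of the
two 32-bit halves inside a nickname, the attribute values and the sample names are assumptions made only for this
illustration. The point shown is that an entry of the GNT has 8 bytes like a descriptor, so the offset-address
where a name is found is already the selector of the segment.

```python
import struct

ENTRY_SIZE = 8   # a nickname has 64 bits, the same size as a descriptor in the GDT

# Hypothetical attribute values; the real encoding is not given in the text.
ATTR_PROGRAM, ATTR_DATA = 1, 2

def make_gnt(entries):
    """Build a nickname table from (name, attribute) pairs, both 32 bits wide."""
    return b"".join(struct.pack("<II", name, attribute) for name, attribute in entries)

def find_selector(gnt, name):
    """Compare the 32-bit name entry by entry; the offset of the hit is the selector."""
    for offset in range(ENTRY_SIZE, len(gnt), ENTRY_SIZE):   # entry 0 pairs with the null descriptor
        entry_name, attribute = struct.unpack_from("<II", gnt, offset)
        if entry_name == name:
            return offset, attribute
    return None

# Entry 0 stays empty, then two made-up nicknames.
gnt = make_gnt([(0, 0), (0x41534D31, ATTR_PROGRAM), (0x54455854, ATTR_DATA)])

print(find_selector(gnt, 0x54455854))
print(find_selector(gnt, 0x12345678))
```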
As no linker is used, every program can be a "standalone"-program, starting at address =0 of a program-segment.
But as you need information about the addresses of labels at certain addresses, if other programs should use them, the
following structure is mandatory:
As I mentioned above, names are needed to make programs relocatable (starting at any base-address). Only a name
enables a program to find an actual segment and its base-address. These names are 32 bits, distinguishing up to
4G programs.
In order to use the offset-address contained in the selector without arithmetic, I added an attribute to the name.
This attribute is well suited to decide in the same run whether the contents of the named segment are program or data,
editable or not and so on...
And the attribute is well suited too to point to an entry in other stacks, which contain those strings the user
really needs: filenames of editable data or executable programs.
This method makes the giant bureaucracy, namely in UNIX-systems, superfluous. You need neither a "process management"
nor a linker for installation or initialisation purposes. If you want to install a program assigned to other programs or
data, you will need hardly more characters than those #include... and #define... statements in C/C++ programs. And you
will need certain calls and jumps, which can be copied and pasted from my example code.
No other structure is needed besides an installation directory, where names of files are assigned to nicknames!
If you ever tried the painful investigation of a UNIX-bureaucracy, you can imagine how much G-byte-garbage leaves
the world by the use of nicknames. And if you ever wondered about up to ten times more path-names and "short"-cuts
than binary opcodes in an ELF-file, you can imagine how much memory becomes available if this garbage is replaced by
nicknames.
One question is left: How to integrate drivers into the kernel binary?
The question to answer first is how a driver differs from a program - as a driver is a program too because of its own
segment. In ASMOS only those programs are called "drivers" which are installed in the same binary that contains ASMOS
or another program and which are initialized by those programs.
Some decisions are needed to distinguish drivers from programs which can do exactly the same, but are started
separately (you could call those programs "modules", but I will call them programs anyway, because there is
no need to distinguish types).
Normally programs are called "drivers" if they deal with devices.
But in ASMOS "drivers" are only those programs which are only initialized when you jump to the first address. Execution
then is done by jumping to other addresses, calling procedures or using data. Those programs become a part of the
assigned and initializing programs (especially ASMOS). By rule these drivers are initialized by calling them FAR.
The program is already a complete operating system (not only a kernel...). You can do editing and manage the machine,
the memory, filesystems and so on using a menu-mode. Every task is branched out of a key-loop, which can be replaced by
another loop too. Inside this loop three loops are encapsulated, which make tasks available that can be handled
in other systems only by multi tasking. There is an editing mode, which can exist in parts of the screen too (windows),
there is a copy-mode for making e.g. copies of binary code too, and there is the menu-mode, where e.g. programs are
started - read the file "ASMOSdoc" for the details of the available sequences. Compared with M$DOS or derivatives, this
is far ahead and more like a LINUX without Xserver (only text-mode) including a C-standard-library, a file manager
"mc", an editor "vi" and a command interpreter "bash".
While the whole ASMOS consists of 130000 bytes of binary, the "vi" alone consists of about 1M byte (with fewer features
than the already completed 2 edit-modes inside ASMOS, which consist of about 20K byte). Besides this there are
procedures available in ASMOS which provide much more than a C-standard-library and allow the easiest linking of programs
to the menu-mode and the use of the keys for I/O.
But nearly every part of ASMOS can be re-defined, because branching to sequences and procedures is mostly done using
FAR-pointers, stored at global addresses. Thus a driver (or a program) acts in its initialisation by re-defining these
pointers and replacing or extending parts of the kernel. Of course, replacements need to act equivalently to the
replaced parts - set certain status bits, define certain variables and read and write certain parameters - but they can
append new branches to the menus too.
Finally a driver can replace only a table and could vanish during installation, if there is nothing else to do.
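The principle of re-defining pointers can be modelled in a few lines. This Python sketch is only a stand-in: Python
functions take the place of the FAR-pointers (selector and offset-address), and the names of the table, the slot and
the procedures are invented for the illustration and are not names from ASMOS.

```python
class PointerTable:
    """Stand-in for the pointers stored at global addresses; slots hold Python functions."""

    def __init__(self):
        self.slots = {}

    def define(self, slot, handler):
        self.slots[slot] = handler

    def replace(self, slot, make_handler):
        """Install make_handler(old) in the slot and return the old handler."""
        old = self.slots[slot]
        self.slots[slot] = make_handler(old)
        return old

    def call(self, slot, *args):
        return self.slots[slot](*args)

def kernel_show_key(key):
    """Hypothetical procedure of the kernel."""
    return "key:" + key

def driver_init(table):
    """Initialisation of a hypothetical driver: extend the kernel procedure and chain to it."""
    def make(old):
        def show_key_upper(key):
            return old(key.upper())
        return show_key_upper
    table.replace("show_key", make)

table = PointerTable()
table.define("show_key", kernel_show_key)
before = table.call("show_key", "a")
driver_init(table)
after = table.call("show_key", "a")
print(before, after)
```

The caller does not change at all: it still branches through the same slot, but reaches the driver first and the
old procedure afterwards.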
The installation program (in FDOS2) makes a single binary out of ASMOS and drivers (and some directories), which is
booted as one piece. Touching rituals at boot time with flags and clouds or unreadable text are superfluous -
ASMOS boots in less than 5 seconds into an editor with a menu-mode replacing a "prompt". Most of the boot-time is
needed for a very thorough memory check.
The installation of drivers (and tables) looks like this in every case:
ASMOS contains a special address at "programend:". It can be read at address =8 in the kernel binary. The
installation-program writes the number of following base-addresses of driver-binaries to this address.
When e.g. a single driver is appended, the value 1 is written there, followed by the offset-address in the
kernel-segment where the 1st opcode of the driver can be found. This looks like this: