This patch is derived from Robert Love's 2.4.21-pre1-1 preemption patch at

ftp://ftp.kernel.org/pub/linux//kernel/people/rml/preempt-kernel/v2.4/preempt-kernel-rml-2.4.21-pre1-1.patch

a) strip all arch-specific parts from the above RML patch
b) apply the rest to the linux-mips.org CVS tree, linux_2_4 branch, on 030304;
   hand-fix a couple of conflicts
c) re-create the patch, which gives the following

Jun

diff -Nru linux/Documentation/Configure.help.orig linux/Documentation/Configure.help
--- linux/Documentation/Configure.help.orig Tue Mar 4 12:21:51 2003
+++ linux/Documentation/Configure.help Tue Mar 4 13:38:21 2003
@@ -287,6 +287,17 @@
 If you have a system with several CPUs, you do not need to say Y
 here: the local APIC will be used automatically.
+Preemptible Kernel
+CONFIG_PREEMPT
+ This option reduces the latency of the kernel when reacting to
+ real-time or interactive events by allowing a low priority process to
+ be preempted even if it is in kernel mode executing a system call.
+ This allows applications to run more reliably even when the system is
+ under load.
+
+ Say Y here if you are building a kernel for a desktop, embedded or
+ real-time system. Say N if you are unsure.
+
 Kernel math emulation
 CONFIG_MATH_EMULATION
 Linux can emulate a math coprocessor (used for floating point
diff -Nru linux/Documentation/preempt-locking.txt.orig linux/Documentation/preempt-locking.txt
--- linux/Documentation/preempt-locking.txt.orig Tue Mar 4 13:38:21 2003
+++ linux/Documentation/preempt-locking.txt Tue Mar 4 13:38:21 2003
@@ -0,0 +1,104 @@
+ Proper Locking Under a Preemptible Kernel:
+ Keeping Kernel Code Preempt-Safe
+ Robert Love
+ Last Updated: 22 Jan 2002
+
+
+INTRODUCTION
+
+
+A preemptible kernel creates new locking issues. The issues are the same as
+those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible
+kernel model leverages existing SMP locking mechanisms.
Thus, the kernel
+requires explicit additional locking for very few additional situations.
+
+This document is for all kernel hackers. Developing code in the kernel
+requires protecting these situations.
+
+
+RULE #1: Per-CPU data structures need explicit protection
+
+
+Two similar problems arise. An example code snippet:
+
+ struct this_needs_locking tux[NR_CPUS];
+ tux[smp_processor_id()] = some_value;
+ /* task is preempted here... */
+ something = tux[smp_processor_id()];
+
+First, since the data is per-CPU, it may not have explicit SMP locking, but
+require it otherwise. Second, when a preempted task is finally rescheduled,
+the previous value of smp_processor_id may not equal the current. You must
+protect these situations by disabling preemption around them.
+
+
+RULE #2: CPU state must be protected.
+
+
+Under preemption, the state of the CPU must be protected. This is arch-
+dependent, but includes CPU structures and state not preserved over a context
+switch. For example, on x86, entering and exiting FPU mode is now a critical
+section that must occur while preemption is disabled. Think what would happen
+if the kernel is executing a floating-point instruction and is then preempted.
+Remember, the kernel does not save FPU state except for user tasks. Therefore,
+upon preemption, the FPU registers will be sold to the lowest bidder. Thus,
+preemption must be disabled around such regions.
+
+Note, some FPU functions are already explicitly preempt safe. For example,
+kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
+However, math_state_restore must be called with preemption disabled.
+
+
+RULE #3: Lock acquire and release must be performed by same task
+
+
+A lock acquired in one task must be released by the same task. This
+means you can't do oddball things like acquire a lock and go off to
+play while another task releases it.
If you want to do something
+like this, acquire and release the task in the same code path and
+have the caller wait on an event by the other task.
+
+
+SOLUTION
+
+
+Data protection under preemption is achieved by disabling preemption for the
+duration of the critical region.
+
+preempt_enable() decrement the preempt counter
+preempt_disable() increment the preempt counter
+preempt_enable_no_resched() decrement, but do not immediately preempt
+preempt_get_count() return the preempt counter
+
+The functions are nestable. In other words, you can call preempt_disable
+n-times in a code path, and preemption will not be reenabled until the n-th
+call to preempt_enable. The preempt statements define to nothing if
+preemption is not enabled.
+
+Note that you do not need to explicitly prevent preemption if you are holding
+any locks or interrupts are disabled, since preemption is implicitly disabled
+in those cases.
+
+Example:
+
+ cpucache_t *cc; /* this is per-CPU */
+ preempt_disable();
+ cc = cc_data(searchp);
+ if (cc && cc->avail) {
+ __free_block(searchp, cc_entry(cc), cc->avail);
+ cc->avail = 0;
+ }
+ preempt_enable();
+ return 0;
+
+Notice how the preemption statements must encompass every reference of the
+critical variables. Another example:
+
+ int buf[NR_CPUS];
+ set_cpu_val(buf);
+ if (buf[smp_processor_id()] == -1) printf(KERN_INFO "wee!\n");
+ spin_lock(&buf_lock);
+ /* ... */
+
+This code is not preempt-safe, but see how easily we can fix it by simply
+moving the spin_lock up two lines.
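For readers who want to see the nesting rule from the SOLUTION section in
isolation, here is a minimal user-space sketch. It is not part of the patch
and it is not kernel code: the model_* functions and the three static
variables are made up for illustration, with plain ints standing in for
current->preempt_count and current->need_resched. The only thing taken from
the patch is the comparison used in preempt_enable().

```c
/* User-space model only, not kernel code. Every name here is hypothetical. */
#include <assert.h>

static int preempt_count;   /* stands in for current->preempt_count */
static int need_resched;    /* stands in for current->need_resched (0 or 1) */
static int reschedules;     /* times the model reached a preemption point */

static void model_preempt_disable(void)
{
    ++preempt_count;
}

static void model_preempt_enable(void)
{
    --preempt_count;
    /* Same comparison as preempt_enable() in this patch: it is true only
     * when the count is back to 0 and a reschedule is pending. */
    if (preempt_count < need_resched) {
        need_resched = 0;
        ++reschedules;
    }
}

/* Runs a nested critical section and returns the reschedule count. */
static int model_nested_section(void)
{
    preempt_count = need_resched = reschedules = 0;

    model_preempt_disable();
    model_preempt_disable();   /* nested: count is now 2 */
    need_resched = 1;          /* a wakeup arrives inside the section */

    model_preempt_enable();    /* count 1: still not preemptible */
    assert(reschedules == 0);

    model_preempt_enable();    /* count 0: preemption point */
    return reschedules;
}
```

Called from a small main(), model_nested_section() returns 1 with the count
back at 0: the inner model_preempt_enable() leaves preemption off, and only
the outermost one acts on the pending reschedule.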
diff -Nru linux/drivers/ieee1394/csr.c.orig linux/drivers/ieee1394/csr.c
--- linux/drivers/ieee1394/csr.c.orig Tue Mar 4 12:18:26 2003
+++ linux/drivers/ieee1394/csr.c Tue Mar 4 13:39:25 2003
@@ -19,6 +19,7 @@
 #include
 #include /* needed for MODULE_PARM */
+#include
 #include "ieee1394_types.h"
 #include "hosts.h"
diff -Nru linux/drivers/sound/sound_core.c.orig linux/drivers/sound/sound_core.c
--- linux/drivers/sound/sound_core.c.orig Mon Nov 5 12:16:12 2001
+++ linux/drivers/sound/sound_core.c Tue Mar 4 13:38:21 2003
@@ -37,6 +37,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff -Nru linux/fs/adfs/map.c.orig linux/fs/adfs/map.c
--- linux/fs/adfs/map.c.orig Mon Nov 5 23:56:09 2001
+++ linux/fs/adfs/map.c Tue Mar 4 13:38:21 2003
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include "adfs.h"
diff -Nru linux/fs/fat/cache.c.orig linux/fs/fat/cache.c
--- linux/fs/fat/cache.c.orig Mon Nov 5 16:55:22 2001
+++ linux/fs/fat/cache.c Tue Mar 4 13:38:21 2003
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #if 0
 # define PRINTK(x) printk x
diff -Nru linux/fs/nls/nls_base.c.orig linux/fs/nls/nls_base.c
--- linux/fs/nls/nls_base.c.orig Wed Jun 26 15:36:21 2002
+++ linux/fs/nls/nls_base.c Tue Mar 4 13:38:21 2003
@@ -18,6 +18,7 @@
 #ifdef CONFIG_KMOD
 #include
 #endif
+#include
 #include
 static struct nls_table *tables;
diff -Nru linux/fs/exec.c.orig linux/fs/exec.c
--- linux/fs/exec.c.orig Tue Mar 4 12:18:52 2003
+++ linux/fs/exec.c Tue Mar 4 13:38:21 2003
@@ -444,8 +444,8 @@
 active_mm = current->active_mm;
 current->mm = mm;
 current->active_mm = mm;
- task_unlock(current);
 activate_mm(active_mm, mm);
+ task_unlock(current);
 mm_release();
 if (old_mm) {
 if (active_mm != old_mm) BUG();
diff -Nru linux/include/linux/brlock.h.orig linux/include/linux/brlock.h
--- linux/include/linux/brlock.h.orig Thu Dec 12 10:41:35 2002
+++ linux/include/linux/brlock.h Tue Mar 4 13:38:21 2003
@@ -171,11 +171,11 @@
 }
 #else
-# define br_read_lock(idx) ((void)(idx))
-#
define br_read_unlock(idx) ((void)(idx))
-# define br_write_lock(idx) ((void)(idx))
-# define br_write_unlock(idx) ((void)(idx))
-#endif
+# define br_read_lock(idx) ({ (void)(idx); preempt_disable(); })
+# define br_read_unlock(idx) ({ (void)(idx); preempt_enable(); })
+# define br_write_lock(idx) ({ (void)(idx); preempt_disable(); })
+# define br_write_unlock(idx) ({ (void)(idx); preempt_enable(); })
+#endif /* CONFIG_SMP */
 /*
 * Now enumerate all of the possible sw/hw IRQ protected
diff -Nru linux/include/linux/dcache.h.orig linux/include/linux/dcache.h
--- linux/include/linux/dcache.h.orig Thu Dec 12 10:39:28 2002
+++ linux/include/linux/dcache.h Tue Mar 4 13:38:21 2003
@@ -127,31 +127,6 @@
 extern spinlock_t dcache_lock;
-/**
- * d_drop - drop a dentry
- * @dentry: dentry to drop
- *
- * d_drop() unhashes the entry from the parent
- * dentry hashes, so that it won't be found through
- * a VFS lookup any more. Note that this is different
- * from deleting the dentry - d_delete will try to
- * mark the dentry negative if possible, giving a
- * successful _negative_ lookup, while d_drop will
- * just make the cache lookup fail.
- *
- * d_drop() is used mainly for stuff that wants
- * to invalidate a dentry for some reason (NFS
- * timeouts or autofs deletes).
- */
-
-static __inline__ void d_drop(struct dentry * dentry)
-{
- spin_lock(&dcache_lock);
- list_del(&dentry->d_hash);
- INIT_LIST_HEAD(&dentry->d_hash);
- spin_unlock(&dcache_lock);
-}
-
 static __inline__ int dname_external(struct dentry *d)
 {
 return d->d_name.name != d->d_iname;
@@ -276,3 +251,34 @@
 #endif /* __KERNEL__ */
 #endif /* __LINUX_DCACHE_H */
+
+#if !defined(__LINUX_DCACHE_H_INLINES) && defined(_TASK_STRUCT_DEFINED)
+#define __LINUX_DCACHE_H_INLINES
+
+#ifdef __KERNEL__
+/**
+ * d_drop - drop a dentry
+ * @dentry: dentry to drop
+ *
+ * d_drop() unhashes the entry from the parent
+ * dentry hashes, so that it won't be found through
+ * a VFS lookup any more.
Note that this is different
+ * from deleting the dentry - d_delete will try to
+ * mark the dentry negative if possible, giving a
+ * successful _negative_ lookup, while d_drop will
+ * just make the cache lookup fail.
+ *
+ * d_drop() is used mainly for stuff that wants
+ * to invalidate a dentry for some reason (NFS
+ * timeouts or autofs deletes).
+ */
+
+static __inline__ void d_drop(struct dentry * dentry)
+{
+ spin_lock(&dcache_lock);
+ list_del(&dentry->d_hash);
+ INIT_LIST_HEAD(&dentry->d_hash);
+ spin_unlock(&dcache_lock);
+}
+#endif
+#endif
diff -Nru linux/include/linux/fs_struct.h.orig linux/include/linux/fs_struct.h
--- linux/include/linux/fs_struct.h.orig Thu Aug 23 15:24:48 2001
+++ linux/include/linux/fs_struct.h Tue Mar 4 13:38:21 2003
@@ -20,6 +20,15 @@
 extern void exit_fs(struct task_struct *);
 extern void set_fs_altroot(void);
+struct fs_struct *copy_fs_struct(struct fs_struct *old);
+void put_fs_struct(struct fs_struct *fs);
+
+#endif
+#endif
+
+#if !defined(_LINUX_FS_STRUCT_H_INLINES) && defined(_TASK_STRUCT_DEFINED)
+#define _LINUX_FS_STRUCT_H_INLINES
+#ifdef __KERNEL__
 /*
 * Replace the fs->{rootmnt,root} with {mnt,dentry}. Put the old values.
 * It can block. Requires the big lock held.
@@ -65,9 +74,5 @@
 mntput(old_pwdmnt);
 }
 }
-
-struct fs_struct *copy_fs_struct(struct fs_struct *old);
-void put_fs_struct(struct fs_struct *fs);
-
 #endif
 #endif
diff -Nru linux/include/linux/sched.h.orig linux/include/linux/sched.h
--- linux/include/linux/sched.h.orig Tue Mar 4 12:18:56 2003
+++ linux/include/linux/sched.h Tue Mar 4 13:38:21 2003
@@ -91,6 +91,7 @@
 #define TASK_UNINTERRUPTIBLE 2
 #define TASK_ZOMBIE 4
 #define TASK_STOPPED 8
+#define PREEMPT_ACTIVE 0x4000000
 #define __set_task_state(tsk, state_value) \
 do { (tsk)->state = (state_value); } while (0)
@@ -157,6 +158,9 @@
 #define MAX_SCHEDULE_TIMEOUT LONG_MAX
 extern signed long FASTCALL(schedule_timeout(signed long timeout));
 asmlinkage void schedule(void);
+#ifdef CONFIG_PREEMPT
+asmlinkage void preempt_schedule(void);
+#endif
 extern int schedule_task(struct tq_struct *task);
 extern void flush_scheduled_tasks(void);
@@ -289,7 +293,7 @@
 * offsets of these are hardcoded elsewhere - touch with care
 */
 volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
- unsigned long flags; /* per process flags, defined below */
+ int preempt_count; /* 0 => preemptable, <0 => BUG */
 int sigpending;
 mm_segment_t addr_limit; /* thread address space:
 0-0xBFFFFFFF for user-thead
@@ -331,6 +335,7 @@
 struct mm_struct *active_mm;
 struct list_head local_pages;
 unsigned int allocation_order, nr_local_pages;
+ unsigned long flags;
 /* task state */
 struct linux_binfmt *binfmt;
@@ -959,5 +964,10 @@
 __cond_resched();
 }
+#define _TASK_STRUCT_DEFINED
+#include
+#include
+#include
+
 #endif /* __KERNEL__ */
 #endif
diff -Nru linux/include/linux/smp_lock.h.orig linux/include/linux/smp_lock.h
--- linux/include/linux/smp_lock.h.orig Thu Dec 12 10:39:28 2002
+++ linux/include/linux/smp_lock.h Tue Mar 4 13:38:21 2003
@@ -3,7 +3,7 @@
 #include
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_PREEMPT)
 #define lock_kernel() do { } while(0)
 #define unlock_kernel() do { } while(0)
diff -Nru
linux/include/linux/spinlock.h.orig linux/include/linux/spinlock.h
--- linux/include/linux/spinlock.h.orig Wed Jan 29 15:33:36 2003
+++ linux/include/linux/spinlock.h Tue Mar 4 13:38:21 2003
@@ -2,6 +2,7 @@
 #define __LINUX_SPINLOCK_H
 #include
+#include
 #include
@@ -64,8 +65,10 @@
 #if (DEBUG_SPINLOCKS < 1)
+#ifndef CONFIG_PREEMPT
 #define atomic_dec_and_lock(atomic,lock) atomic_dec_and_test(atomic)
 #define ATOMIC_DEC_AND_LOCK
+#endif
 /*
 * Your basic spinlocks, allowing only a single CPU anywhere
@@ -82,11 +85,11 @@
 #endif
 #define spin_lock_init(lock) do { } while(0)
-#define spin_lock(lock) (void)(lock) /* Not "unused variable". */
+#define _raw_spin_lock(lock) (void)(lock) /* Not "unused variable". */
 #define spin_is_locked(lock) (0)
-#define spin_trylock(lock) ({1; })
+#define _raw_spin_trylock(lock) ({1; })
 #define spin_unlock_wait(lock) do { } while(0)
-#define spin_unlock(lock) do { } while(0)
+#define _raw_spin_unlock(lock) do { } while(0)
 #elif (DEBUG_SPINLOCKS < 2)
@@ -146,13 +149,78 @@
 #endif
 #define rwlock_init(lock) do { } while(0)
-#define read_lock(lock) (void)(lock) /* Not "unused variable". */
-#define read_unlock(lock) do { } while(0)
-#define write_lock(lock) (void)(lock) /* Not "unused variable". */
-#define write_unlock(lock) do { } while(0)
+#define _raw_read_lock(lock) (void)(lock) /* Not "unused variable". */
+#define _raw_read_unlock(lock) do { } while(0)
+#define _raw_write_lock(lock) (void)(lock) /* Not "unused variable".
*/
+#define _raw_write_unlock(lock) do { } while(0)
 #endif /* !SMP */
+#ifdef CONFIG_PREEMPT
+
+#define preempt_get_count() (current->preempt_count)
+#define preempt_is_disabled() (preempt_get_count() != 0)
+
+#define preempt_disable() \
+do { \
+ ++current->preempt_count; \
+ barrier(); \
+} while (0)
+
+#define preempt_enable_no_resched() \
+do { \
+ --current->preempt_count; \
+ barrier(); \
+} while (0)
+
+#define preempt_enable() \
+do { \
+ --current->preempt_count; \
+ barrier(); \
+ if (unlikely(current->preempt_count < current->need_resched)) \
+ preempt_schedule(); \
+} while (0)
+
+#define spin_lock(lock) \
+do { \
+ preempt_disable(); \
+ _raw_spin_lock(lock); \
+} while(0)
+
+#define spin_trylock(lock) ({preempt_disable(); _raw_spin_trylock(lock) ? \
+ 1 : ({preempt_enable(); 0;});})
+#define spin_unlock(lock) \
+do { \
+ _raw_spin_unlock(lock); \
+ preempt_enable(); \
+} while (0)
+
+#define read_lock(lock) ({preempt_disable(); _raw_read_lock(lock);})
+#define read_unlock(lock) ({_raw_read_unlock(lock); preempt_enable();})
+#define write_lock(lock) ({preempt_disable(); _raw_write_lock(lock);})
+#define write_unlock(lock) ({_raw_write_unlock(lock); preempt_enable();})
+#define write_trylock(lock) ({preempt_disable();_raw_write_trylock(lock) ?
\
+ 1 : ({preempt_enable(); 0;});})
+
+#else
+
+#define preempt_get_count() (0)
+#define preempt_is_disabled() (1)
+#define preempt_disable() do { } while (0)
+#define preempt_enable_no_resched() do {} while(0)
+#define preempt_enable() do { } while (0)
+
+#define spin_lock(lock) _raw_spin_lock(lock)
+#define spin_trylock(lock) _raw_spin_trylock(lock)
+#define spin_unlock(lock) _raw_spin_unlock(lock)
+
+#define read_lock(lock) _raw_read_lock(lock)
+#define read_unlock(lock) _raw_read_unlock(lock)
+#define write_lock(lock) _raw_write_lock(lock)
+#define write_unlock(lock) _raw_write_unlock(lock)
+#define write_trylock(lock) _raw_write_trylock(lock)
+#endif
+
 /* "lock on reference count zero" */
 #ifndef ATOMIC_DEC_AND_LOCK
 #include
diff -Nru linux/include/linux/tqueue.h.orig linux/include/linux/tqueue.h
--- linux/include/linux/tqueue.h.orig Thu Dec 12 10:39:28 2002
+++ linux/include/linux/tqueue.h Tue Mar 4 13:38:21 2003
@@ -94,6 +94,22 @@
 extern spinlock_t tqueue_lock;
 /*
+ * Call all "bottom halfs" on a given list.
+ */
+
+extern void __run_task_queue(task_queue *list);
+
+static inline void run_task_queue(task_queue *list)
+{
+ if (TQ_ACTIVE(*list))
+ __run_task_queue(list);
+}
+
+#endif /* _LINUX_TQUEUE_H */
+
+#if !defined(_LINUX_TQUEUE_H_INLINES) && defined(_TASK_STRUCT_DEFINED)
+#define _LINUX_TQUEUE_H_INLINES
+/*
 * Queue a task on a tq. Return non-zero if it was successfully
 * added.
 */
@@ -109,17 +125,4 @@
 }
 return ret;
 }
-
-/*
- * Call all "bottom halfs" on a given list.
- */
-
-extern void __run_task_queue(task_queue *list);
-
-static inline void run_task_queue(task_queue *list)
-{
- if (TQ_ACTIVE(*list))
- __run_task_queue(list);
-}
-
-#endif /* _LINUX_TQUEUE_H */
+#endif
diff -Nru linux/kernel/exit.c.orig linux/kernel/exit.c
--- linux/kernel/exit.c.orig Wed Jan 29 15:33:36 2003
+++ linux/kernel/exit.c Tue Mar 4 13:38:21 2003
@@ -313,8 +313,8 @@
 /* more a memory barrier than a real lock */
 task_lock(tsk);
 tsk->mm = NULL;
- task_unlock(tsk);
 enter_lazy_tlb(mm, current, smp_processor_id());
+ task_unlock(tsk);
 mmput(mm);
 }
 }
@@ -435,6 +435,11 @@
 tsk->flags |= PF_EXITING;
 del_timer_sync(&tsk->real_timer);
+ if (unlikely(preempt_get_count()))
+ printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
+ current->comm, current->pid,
+ preempt_get_count());
+
 fake_volatile:
 #ifdef CONFIG_BSD_PROCESS_ACCT
 acct_process(code);
diff -Nru linux/kernel/fork.c.orig linux/kernel/fork.c
--- linux/kernel/fork.c.orig Wed Sep 11 05:45:42 2002
+++ linux/kernel/fork.c Tue Mar 4 13:38:21 2003
@@ -629,6 +629,13 @@
 if (p->binfmt && p->binfmt->module)
 __MOD_INC_USE_COUNT(p->binfmt->module);
+#ifdef CONFIG_PREEMPT
+ /*
+ * Continue with preemption disabled as part of the context
+ * switch, so start with preempt_count set to 1.
+ */
+ p->preempt_count = 1;
+#endif
 p->did_exec = 0;
 p->swappable = 0;
 p->state = TASK_UNINTERRUPTIBLE;
diff -Nru linux/kernel/ksyms.c.orig linux/kernel/ksyms.c
--- linux/kernel/ksyms.c.orig Tue Mar 4 12:18:56 2003
+++ linux/kernel/ksyms.c Tue Mar 4 13:38:21 2003
@@ -451,6 +451,9 @@
 EXPORT_SYMBOL(interruptible_sleep_on);
 EXPORT_SYMBOL(interruptible_sleep_on_timeout);
 EXPORT_SYMBOL(schedule);
+#ifdef CONFIG_PREEMPT
+EXPORT_SYMBOL(preempt_schedule);
+#endif
 EXPORT_SYMBOL(schedule_timeout);
 EXPORT_SYMBOL(yield);
 EXPORT_SYMBOL(__cond_resched);
diff -Nru linux/kernel/sched.c.orig linux/kernel/sched.c
--- linux/kernel/sched.c.orig Tue Mar 4 12:18:56 2003
+++ linux/kernel/sched.c Tue Mar 4 13:38:21 2003
@@ -489,7 +489,7 @@
 task_lock(prev);
 task_release_cpu(prev);
 mb();
- if (prev->state == TASK_RUNNING)
+ if (task_on_runqueue(prev))
 goto needs_resched;
 out_unlock:
@@ -519,7 +519,7 @@
 goto out_unlock;
 spin_lock_irqsave(&runqueue_lock, flags);
- if ((prev->state == TASK_RUNNING) && !task_has_cpu(prev))
+ if (task_on_runqueue(prev) && !task_has_cpu(prev))
 reschedule_idle(prev);
 spin_unlock_irqrestore(&runqueue_lock, flags);
 goto out_unlock;
@@ -532,6 +532,7 @@
 asmlinkage void schedule_tail(struct task_struct *prev)
 {
 __schedule_tail(prev);
+ preempt_enable();
 }
 /*
@@ -551,9 +552,10 @@
 struct list_head *tmp;
 int this_cpu, c;
-
 spin_lock_prefetch(&runqueue_lock);
+ preempt_disable();
+
 BUG_ON(!current->active_mm);
 need_resched_back:
 prev = current;
@@ -581,6 +583,14 @@
 move_last_runqueue(prev);
 }
+#ifdef CONFIG_PREEMPT
+ /*
+ * entering from preempt_schedule, off a kernel preemption,
+ * go straight to picking the next task.
+ */
+ if (unlikely(preempt_get_count() & PREEMPT_ACTIVE))
+ goto treat_like_run;
+#endif
 switch (prev->state) {
 case TASK_INTERRUPTIBLE:
 if (signal_pending(prev)) {
@@ -591,6 +601,9 @@
 del_from_runqueue(prev);
 case TASK_RUNNING:;
 }
+#ifdef CONFIG_PREEMPT
+ treat_like_run:
+#endif
 prev->need_resched = 0;
 /*
@@ -699,9 +712,31 @@
 reacquire_kernel_lock(current);
 if (current->need_resched)
 goto need_resched_back;
+ preempt_enable_no_resched();
 return;
 }
+#ifdef CONFIG_PREEMPT
+/*
+ * this is is the entry point to schedule() from in-kernel preemption
+ */
+asmlinkage void preempt_schedule(void)
+{
+ if (unlikely(irqs_disabled()))
+ return;
+
+need_resched:
+ current->preempt_count += PREEMPT_ACTIVE;
+ schedule();
+ current->preempt_count -= PREEMPT_ACTIVE;
+
+ /* we could miss a preemption opportunity between schedule and now */
+ barrier();
+ if (unlikely(current->need_resched))
+ goto need_resched;
+}
+#endif /* CONFIG_PREEMPT */
+
 /*
 * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just wake everything
 * up. If it's an exclusive wakeup (nr_exclusive == small +ve number) then we wake all the
@@ -1327,6 +1362,13 @@
 sched_data->curr = current;
 sched_data->last_schedule = get_cycles();
 clear_bit(current->processor, &wait_init_idle);
+#ifdef CONFIG_PREEMPT
+ /*
+ * fix up the preempt_count for non-CPU0 idle threads
+ */
+ if (current->processor)
+ current->preempt_count = 0;
+#endif
 }
 extern void init_timervecs (void);
diff -Nru linux/lib/dec_and_lock.c.orig linux/lib/dec_and_lock.c
--- linux/lib/dec_and_lock.c.orig Mon Nov 5 12:16:32 2001
+++ linux/lib/dec_and_lock.c Tue Mar 4 13:38:21 2003
@@ -1,5 +1,6 @@
 #include
 #include
+#include
 #include
 /*
diff -Nru linux/mm/slab.c.orig linux/mm/slab.c
--- linux/mm/slab.c.orig Tue Mar 4 12:18:56 2003
+++ linux/mm/slab.c Tue Mar 4 13:38:21 2003
@@ -49,7 +49,8 @@
 * constructors and destructors are called without any locking.
 * Several members in kmem_cache_t and slab_t never change, they
 * are accessed without any locking.
- * The per-cpu arrays are never accessed from the wrong cpu, no locking.
+ * The per-cpu arrays are never accessed from the wrong cpu, no locking,
+ * and local interrupts are disabled so slab code is preempt-safe.
 * The non-constant members are protected with a per-cache irq spinlock.
 *
 * Further notes from the original documentation:
diff -Nru linux/net/core/dev.c.orig linux/net/core/dev.c
--- linux/net/core/dev.c.orig Tue Mar 4 12:18:56 2003
+++ linux/net/core/dev.c Tue Mar 4 13:38:21 2003
@@ -1049,9 +1049,15 @@
 int cpu = smp_processor_id();
 if (dev->xmit_lock_owner != cpu) {
+ /*
+ * The spin_lock effectivly does a preempt lock, but
+ * we are about to drop that...
+ */
+ preempt_disable();
 spin_unlock(&dev->queue_lock);
 spin_lock(&dev->xmit_lock);
 dev->xmit_lock_owner = cpu;
+ preempt_enable();
 if (!netif_queue_stopped(dev)) {
 if (netdev_nit)
diff -Nru linux/net/core/skbuff.c.orig linux/net/core/skbuff.c
--- linux/net/core/skbuff.c.orig Wed Jun 26 15:36:49 2002
+++ linux/net/core/skbuff.c Tue Mar 4 13:38:21 2003
@@ -111,33 +111,37 @@
 static __inline__ struct sk_buff *skb_head_from_pool(void)
 {
- struct sk_buff_head *list = &skb_head_pool[smp_processor_id()].list;
+ struct sk_buff_head *list;
+ struct sk_buff *skb = NULL;
+ unsigned long flags;
- if (skb_queue_len(list)) {
- struct sk_buff *skb;
- unsigned long flags;
+ local_irq_save(flags);
- local_irq_save(flags);
+ list = &skb_head_pool[smp_processor_id()].list;
+
+ if (skb_queue_len(list))
 skb = __skb_dequeue(list);
- local_irq_restore(flags);
- return skb;
- }
- return NULL;
+
+ local_irq_restore(flags);
+ return skb;
 }
 static __inline__ void skb_head_to_pool(struct sk_buff *skb)
 {
- struct sk_buff_head *list = &skb_head_pool[smp_processor_id()].list;
+ struct sk_buff_head *list;
+ unsigned long flags;
- if (skb_queue_len(list) < sysctl_hot_list_len) {
- unsigned long flags;
+ local_irq_save(flags);
+ list
= &skb_head_pool[smp_processor_id()].list;
- local_irq_save(flags);
+ if (skb_queue_len(list) < sysctl_hot_list_len) {
 __skb_queue_head(list, skb);
 local_irq_restore(flags);
 return;
 }
+
+ local_irq_restore(flags);
 kmem_cache_free(skbuff_head_cache, skb);
 }
diff -Nru linux/net/sunrpc/pmap_clnt.c.orig linux/net/sunrpc/pmap_clnt.c
--- linux/net/sunrpc/pmap_clnt.c.orig Wed Jun 26 15:36:51 2002
+++ linux/net/sunrpc/pmap_clnt.c Tue Mar 4 13:38:21 2003
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff -Nru linux/net/socket.c.orig linux/net/socket.c
--- linux/net/socket.c.orig Tue Mar 4 12:18:56 2003
+++ linux/net/socket.c Tue Mar 4 13:38:21 2003
@@ -132,7 +132,7 @@
 static struct net_proto_family *net_families[NPROTO];
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
 static atomic_t net_family_lockct = ATOMIC_INIT(0);
 static spinlock_t net_family_lock = SPIN_LOCK_UNLOCKED;
diff -Nru linux/CREDITS.orig linux/CREDITS
--- linux/CREDITS.orig Tue Mar 4 12:17:38 2003
+++ linux/CREDITS Tue Mar 4 13:38:21 2003
@@ -1001,8 +1001,8 @@
 N: Nigel Gamble
 E: nigel@nrg.org
-E: nigel@sgi.com
 D: Interrupt-driven printer driver
+D: Preemptible kernel
 S: 120 Alley Way
 S: Mountain View, California 94040
 S: USA
diff -Nru linux/MAINTAINERS.orig linux/MAINTAINERS
--- linux/MAINTAINERS.orig Tue Mar 4 12:17:39 2003
+++ linux/MAINTAINERS Tue Mar 4 13:38:21 2003
@@ -1332,6 +1332,14 @@
 M: mostrows@styx.uwaterloo.ca
 S: Maintained
+PREEMPTIBLE KERNEL
+P: Robert M. Love
+M: rml@tech9.net
+L: linux-kernel@vger.kernel.org
+L: kpreempt-tech@lists.sourceforge.net
+W: http://tech9.net/rml/linux
+S: Supported
+
 PROMISE DC4030 CACHING DISK CONTROLLER DRIVER
 P: Peter Denison
 M: promise@pnd-pc.demon.co.uk