NFS related news

An old perl script

Kool Aid Served Daily - 12 hours 8 min ago

I've got an old perl script that I have gotten a lot of mileage from:

    package read_txtfile_format;

    sub main'read_txtfile_format {
        local(*file, *format) = @_;
        local($first_line, $first_char) = '';

        # Skip lines until the one starting with '!' (the header line)
        do {
            $first_line = <file>;
            $first_line =~ /(.)(.*)/;
            $first_char = $1;
            $first_line = $2;
        } until ($first_char eq "!" || eof(file));

        if (eof(file)) {
            die "There is no ! header line in $file";
        }

        # Turn "a,b,c" into '$a, $b, $c' for a later eval
        $format = '$' . join(', $', split(/,/, $first_line));
    }

I didn't write it, I think either Mark Lawrence or Walt Gaber did while I was at DRD Corporation. Or they got it somewhere. I know they called it a data dictionary - which I'm not sure is a way I would use that term these days. By the way, I still have and use my very beaten up copy of Programming Perl from back then - a 1991 printing.

What it does is allow you to read in another file, generate variable names based on a line starting with a '!' and then use those names per line. It is a cheap database laid out in a flat file.

I think we called this basic script and its associated text files data dictionaries because we would use it to quickly prototype and change data structures in C. I know I used it in my Genetic Programming research to describe the operators used in a new problem set.

Perhaps an example will show the power.

Resume example

I want to quickly take my resume and reformat it as needed. Perhaps I need it in html format, a plain text file, etc.

I can keep my data in a file, I can have a skeleton script to process it, and I can quickly change it to adapt to new styles...

I'm picking an example which looks like I'm pimping myself out because I thought it was quirky and fun to code. It is also a way I never would have thought to do an example with this piece of code.

Data file

    !started,ended,title,company,description
    1/05,present,Staff Engineer Software,Sun Microsystems,NFS development
    6/01,12/05,File System Engineer,Network Appliance,WAFL and NFS development
    4/01,6/01,Manager,Network Appliance,Manager of Engineering Internal Test
    10/99,4/01,System Administrator,Network Appliance,Perl hacker and filer administrator

Perl script

    #! /usr/bin/perl

    do 'getthead.pl';

    open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\n";

    # Determine the Column Names
    do main'read_txtfile_format(*LNG_FILE, *languages);

    lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);

        eval "($languages) = split(/[,\n]/)";

        print "$started - $ended: $title for $company\n\t$description\n\n";
    }

Results

And here I have it dumping out a format much like the resume.txt file I have been updating as I change job functions:

    > ./resume.pl resume.txt
    1/05 - present: Staff Engineer Software for Sun Microsystems
            NFS development

    6/01 - 12/05: File System Engineer for Network Appliance
            WAFL and NFS development

    4/01 - 6/01: Manager for Network Appliance
            Manager of Engineering Internal Test

    10/99 - 4/01: System Administrator for Network Appliance
            Perl hacker and filer administrator

But wait, there is an easier way

    #! /usr/bin/perl

    open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\n";

    lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);

        ($started, $ended, $title, $company, $description) = split(/[,\n]/);

        print "$started - $ended: $title for $company\n\t$description\n\n";
    }

It does the same thing, and with less code.

But it isn't as dynamic. I have to edit both the data file and the script to make a change. If I were to add a new field location after company, I would have to change the script on the split. Also, what if I have many scripts manipulating the same data? During my research, I had two different data files and six different scripts per problem set.
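As a rough sketch of the same header-driven idea in Python 3 (not part of the original scripts): the field names come from the '!' line, so adding a column to the data file needs no change to the reader. The names read_records, format_resume and SAMPLE are made up for this illustration, and unlike the Perl version it returns dictionaries instead of creating variables through eval.

```python
import io


def read_records(lines, sep=","):
    """Yield one dict per data line, keyed by the names on the '!' header line."""
    fields = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue                        # skip blank lines and comments
        if line.startswith("!"):
            fields = line[1:].split(sep)    # column names live in the data file
            continue
        if fields is None:
            raise ValueError("There is no ! header line before the data")
        yield dict(zip(fields, line.split(sep)))


# Hypothetical sample data, in the same layout as the resume file.
SAMPLE = """\
!started,ended,title,company,description
1/05,present,Staff Engineer Software,Sun Microsystems,NFS development
6/01,12/05,File System Engineer,Network Appliance,WAFL and NFS development
"""


def format_resume(lines):
    """Render every record in the plain-text resume layout."""
    template = "{started} - {ended}: {title} for {company}\n\t{description}\n\n"
    return "".join(template.format(**rec) for rec in read_records(lines))


if __name__ == "__main__":
    print(format_resume(io.StringIO(SAMPLE)), end="")
```

A new field such as location would show up as rec["location"] in every script that reads the file; only the scripts that want to print it would need to change.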

Research examples

For clique detection, 5-6 lines of data dictionary entries resulted in about 400 lines of C code. For predator/prey, about 50 lines of data dictionary entries resulted in about 890 lines of C code.

An example set of data dictionary entries for the predator/prey would be:

    !fId:fSymbol:fType:fArity:fMacro:fDifGen:fChild1:fChild2:fChild3:fChild4:fChild5:fDescription
    Agent:Ag:Agent:1:True:False:Agent:NO_CHILD:NO_CHILD:NO_CHILD:NO_CHILD:Returns A's predatorId.
    And:&&:Boolean:2:True:False:Boolean:Boolean:NO_CHILD:NO_CHILD:NO_CHILD:A AND B.
    CellOf:CellOf:Cell:2:False:False:Agent:Tack:NO_CHILD:NO_CHILD:NO_CHILD:The (X,Y) coordinate of agent A if it moves from its current cell to the one in the Tack B.

(Note I've changed the separator to a ':' for clarity.)

Part of the processing script would be:

    # Put in all Caps
    ( $capId = 'GP_L' . $LBranch . '_F_' . $fId ) =~ tr/a-z/A-Z/;
    ...
    print INI_FP ' /*';
    print INI_FP ' * ' . $fDescription;
    print INI_FP ' */';
    print INI_FP ' pgps->als[' . $LBranch . '].afs[i].iId = ' . $capId . ';';
    print INI_FP ' pgps->als[' . $LBranch . '].afs[i].psSymbol = "' . $fSymbol . '";';
    print INI_FP ' pgps->als[' . $LBranch . '].afs[i].ftType = ' . $capType . ';';
    print INI_FP ' pgps->als[' . $LBranch . '].afs[i].arity = ' . $fArity . ';';
    ...

And some resulting code would be:

    /*
     * Branch 0 - Main language for the system
     */

    /*
     * Branch 0 - Functions
     */
    pgps->als[0].afs = (FunctionsStruct *)calloc( GP_L0_MAX_FUNCTIONS, GP_FUNCTIONS_SIZE );
    if ( !pgps->als[0].afs ) {
        GU_logError( stderr, "%s(%d): Out of Memory!\n", __FILE__, __LINE__ );
        GU_exit ( -1 );
    }

    /*
     * Returns A's predatorId.
     */
    pgps->als[0].afs[i].iId = GP_L0_F_AGENT;
    pgps->als[0].afs[i].psSymbol = "Ag";
    pgps->als[0].afs[i].ftType = FT_L0_E_AGENT;
    pgps->als[0].afs[i].arity = 1;
    pgps->als[0].afs[i].bMacro = TRUE;
    pgps->als[0].afs[i].bActive = TRUE;
    pgps->als[0].afs[i].bDifGeneric = FALSE;
    pgps->als[0].afs[i].pftChildren = (FunctionTypes *)calloc( pgps->als[0].afs[i].arity, FT_TYPES_SIZE );
    if ( !pgps->als[0].afs[i].pftChildren ) {
        GU_logError( stderr, "%s(%d): Out of Memory!\n", __FILE__, __LINE__ );
        GU_exit ( -1 );
    }
    pgps->als[0].afs[i].pftChildren[0] = FT_L0_E_AGENT;
    pgps->als[0].afs[i].pFct = gpf_L0_Agent;
    i++;

    /*
     * A AND B.
     */
    pgps->als[0].afs[i].iId = GP_L0_F_AND;
    pgps->als[0].afs[i].psSymbol = "&&";

By the way, the same script f_types.pl would process all of the language data dictionaries without being modified. If I happened to change the underlying data structures in the C code, i.e., FunctionsStruct, I could change that one script and rebuild all of the different languages.

So where's the Beef?

Why the walk down memory lane?

Well, I still use this script. I've used it to do volunteer scheduling at AAAI, generate Java opcodes for a simple JVM implementation, plan a new company, check for sibling conflicts during a recreational soccer season, implement testbeds for QA efforts for multiple companies, etc. I don't have to have a database on my system. I can suck data out of a database on a Windows box, store it in a CSV datafile on OpenSolaris, and play with the data. I don't have to know SQL and/or care too much about the data. I can generate "reports" and such from the CLI.

And it is the power of Perl (well, the eval() it offers) which lets me get away with this. One of the selling points of Perl for me was rapid prototyping, especially with respect to strings. I could have written C programs to do all of this, but why?

If I'm going to learn Python, I need to be able to replace this piece of functionality. Or else I'll be back with Perl before you know it.

And honestly, even if I learn to make Python bark for me, I'll pick up the tool I need when I have to. :->

Well off to sleep and I'll pick this up tomorrow when I start playing with Python.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

This is my 501st entry

Kool Aid Served Daily - Tue, 10/07/2008 - 2:32am

The site is telling me I've hit 500 blog entries:

    Kool Aid Served Daily
    Link: http://blogs.sun.com/tdh
    Permission: ADMIN
    Description: For the Adrenaline Junkie
    Members: 1
    Today's hits: 3240
    New Entry | Entries (500) | Comments (281) | Theme | Settings

And they said this Internety thing would never take off!


Pushing your open gate to OpenSolaris

Kool Aid Served Daily - Tue, 10/07/2008 - 2:25am

In Setting up a Development Project Gate, I showed you the steps I took to set up a shared development gate. The only thing I left out was the automatic push to a Mercurial repository on OpenSolaris. I'll take you through these steps now.

By the way, thanks to Dave Marker for sharing how ONNV does this and writing some Python tools to automate the majority of the push.

Get a repository

First you need to get a project leader for your project to create a workspace for you on hg.opensolaris.org. They can do this by logging in to OpenSolaris and going to the project page. Then they select 'SCM Management', then 'Add Repository', and remember to click the Mercurial radio button on the next screen. The page is pretty self-explanatory and the only thing to remember is that it might spew up a message about the project already existing. It also seems to take 5-10 minutes for the project space to show up.

That is okay if you know about it. You can take this time to configure who has commit rights. (You'll need one account for sure, more on that later.)

As evidenced in Setting up a NFS41 gate, I had a hard time figuring out where the gate was located. It can be reached as:

hg clone ssh://anon@hg.opensolaris.org/hg/<PROJECT>/<GATE-NAME>

Note you need to configure anonymous access on the repository page for this to work. I'm not sure, but I believe this also implies anyone can write back to the gate. Tie this in with a lack of tools to maintain the gate out there (including deletion) and you can see this might be a problem.

Configure a SSH public key

This part is the scary part - scary in that you have to decide if you want automatic pushes or manual pushes.

Manual pushes

If you want manual pushes, then you need to make sure you have an SSH public key set up as described in opensolaris.org SSH key help. Then whenever you want to push a change, you would do:

hg push -R <PATH-TO-YOUR-GATE> -e "ssh -q -F <PATH-TO-SSH-CONFIG>" ssh://<YOUR OSOL USERNAME>@hg.opensolaris.org/hg/<PROJECT>/<GATE-NAME>

You don't have to read much more of the rest of this entry, except perhaps to make sure your proxy host is correct.

Automatic pushes

If you want automatic pushes, then you need to configure a special key pair without a passphrase. Note that this is a risk you need to manage yourself. It needs to be blank because you don't want to even store the passphrase anywhere. I've done this type of thing in the past for production systems and the trick is that you want to lock down the box on which you have stored the private version of the key.

Follow the directions as described in opensolaris.org SSH key help to publish this key. Note that no-one will know that this passphrase is empty for this key. They still need the private copy of the key. We have to make sure that is locked down tight!

  • Use a system with a non-standard root password.
  • Strictly control admin access to the system
    • Don't hand out the root password.
    • Don't make sudo too permissive.
  • Pick a non-global account to store the data in.
  • Make sure the permissions are 700 on the directory where the key is stored.
  • If you share that homedir, make sure to read The myth of security with AUTH_SYS, which means:
    • Do not accept the default access list options.
    • Set root= or use anon=-1.
    • Set the rw= and/or ro= hosts appropriately.
    • Strongly consider kerberizing the share.

There is probably more you can do, so the first thing is to accept responsibility for this setup.
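The permissions item on that checklist is easy to verify mechanically. Here is a minimal sketch in Python 3, assuming a POSIX system; the function names loose_permissions and check_key_dir are invented for this example and are not part of any OpenSolaris tooling. It only looks at permission bits, so it says nothing about NFS access lists or who holds the root password.

```python
import os
import stat


def loose_permissions(path):
    """Return the group/other permission bits set on path (0 means owner-only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO)


def check_key_dir(key_dir):
    """Return a list of complaints about key_dir and the files directly in it."""
    problems = []
    if loose_permissions(key_dir):
        problems.append("%s: directory is open to group or other" % key_dir)
    for name in sorted(os.listdir(key_dir)):
        full = os.path.join(key_dir, name)
        if loose_permissions(full):
            problems.append("%s: file is open to group or other" % full)
    return problems


if __name__ == "__main__":
    import sys
    # Usage (hypothetical path): python check_key_dir.py /path/to/key/directory
    for problem in check_key_dir(sys.argv[1]):
        print(problem)
```

An empty result means the directory is 700 or tighter and the key files are not readable by anyone but the owner.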

Okay, having scared you, it is now time to configure this special ssh configuration:

    [nfs4hg@aus1500-home ~/tmp]> mkdir opensolaris
    [nfs4hg@aus1500-home ~/tmp]> chmod 700 opensolaris/
    [nfs4hg@aus1500-home ~/tmp]> cp ~/opensolaris/* .
    [nfs4hg@aus1500-home ~/tmp]> more config
    Host *.org
        GlobalKnownHostsFile /pool/nfs4hg/opensolaris/known_hosts
        ProxyCommand /usr/lib/ssh/ssh-socks5-proxy-connect -h 192.18.43.19 %h %p
        IdentityFile /pool/nfs4hg/opensolaris/id_dsa
        User <YOUR OSOL USERNAME>

You'll have to figure out whether or not you need the proxy configuration. And you will need to fix up the paths for the GlobalKnownHostsFile and IdentityFile.

Also note that I've got this account already set up with a ~/.ssh directory to allow integrations to the gate. It doesn't have DS keys, but I like to keep these apart. And note that id_dsa and id_dsa.txt are the keypair that you generated with the empty passphrase.

Edit hook/updateoso.py

In the copy of hook/updateoso.py I got from ssh://anon@hg.opensolaris.org/hg/scm-migration/onnv-gk-tools, I had the following minor diffs:

    [nfs4hg@aus1500-home ~]> diff /pool/onnv-gk-tools/hook/updateoso.py onnv-gk-tools/hook/updateoso.py
    33c33
    < OSOREPO = "ssh://hg.opensolaris.org/hg/nfsv41/nfs41-gate"
    ---
    > OSOREPO = "ssh://hg.opensolaris.org/hg/onnv/onnv-gate"
    43c43
    < This script must be run as user "nfs4hg".
    ---
    > This script must be run as user "onhg".
    76c76
    < home = pwd.getpwnam("nfs4hg")[5]
    ---
    > home = pwd.getpwnam("onhg")[5]
    95c95
    < if utility.check_user("nfs4hg", m) is False:
    ---
    > if utility.check_user("onhg", m) is False:
    97c97
    < m.write('''Can't update OpenSolaris. User != "nfs4hg"\n''')
    ---
    > m.write('''Can't update OpenSolaris. User != "onhg"\n''')

Note that I grabbed this script right after Dave Marker put it back, so I got a cutting edge version of it. I know he is about to change it such that the user and osol path are configurable variables outside of this file. I.e., you would make changes in a project specific configuration file.

Anyway, once those changes are made, go ahead and make sure this line is enabled in your official clone's hgrc:

    # These hooks are run from bghook() in the background
    bg-changegroup.0 = python:hook.updateoso.updateoso

I'm assuming you have modeled your development gate à la onnv (and as I described in Setting up a Development Project Gate). If you haven't, then I think all you need to do is add this as the last bg-changegroup entry in your gate's hgrc.

And now you are good to go!

You might want to do the manual push I described above to seed the gate.


Trying to figure out printing and variables in Python

Kool Aid Served Daily - Mon, 10/06/2008 - 11:06pm

I'm pretty used to referencing variables inside print blocks in Perl. I'm not at all comfortable with Python. I have a block of code in which I want the 'onhg' to come out of a config file. So I set up a scratch directory and made a bare bones implementation:

    [th199096@jhereg etc]> ls -laiR ~/scratch/
    /home/th199096/scratch/:
    total 42
    749 drwxr-xr-x  3 th199096 staff    4 Oct 6 16:36 .
      3 drwxr-xr-x 39 th199096 staff   55 Oct 6 16:35 ..
    752 drwxr-xr-x  2 th199096 staff    6 Oct 6 16:42 etc
    750 -rwxr-xr-x  1 th199096 staff 1297 Oct 6 16:42 updateoso.py

    /home/th199096/scratch/etc:
    total 25
    752 drwxr-xr-x  2 th199096 staff    6 Oct 6 16:42 .
    749 drwxr-xr-x  3 th199096 staff    4 Oct 6 16:36 ..
    753 -rw-r--r--  1 th199096 staff 1052 Oct 6 16:40 __init__.py
    754 -rw-r--r--  1 th199096 staff  243 Oct 6 16:41 __init__.pyc
    751 -rwxr-xr-x  1 th199096 staff   94 Oct 6 16:42 config.py
    756 -rw-r--r--  1 th199096 staff  257 Oct 6 16:42 config.pyc

Where the config file simply has:

    GATE_USER = "onhg"
    GATE_GROUP = "gk"
    OSOREPO = "ssh://hg.opensolaris.org/hg/onnv/onnv-gate"

And the updateoso has:

    import os, pwd, subprocess, sys
    from mercurial import hg, repo
    from mercurial.node import hex

    sys.path.insert(1, os.path.realpath(os.path.join(os.path.dirname(__file__), "..")))
    from etc import config

    __USAGE = """
    updateoso.py [-n]
    -n: dry run, no email sent (displayed on stdout)
    -R: root dir of repo (where .hg is)
    Attempt to send changes to %s
    This script must be run as user "onhg".
    You should set up RBAC and use pfexec(1).
    """ % (config.OSOREPO)

    __USAGE = __USAGE.strip()

    print >> sys.stderr, __USAGE

Well, in isolation, I can already see what I am going to have to do. All I need to do is replace the 'onhg' with a %s and add a second argument:

    [th199096@jhereg ~/scratch]> diff updateoso.py updateoso.py.first
    39c39
    < This script must be run as user "%s".
    ---
    > This script must be run as user "onhg".
    41c41
    < """ % (config.OSOREPO, config.GATE_USER)
    ---
    > """ % (config.OSOREPO)

And we get:

    [th199096@jhereg ~/scratch]> ./updateoso.py
    updateoso.py [-n]
    -n: dry run, no email sent (displayed on stdout)
    -R: root dir of repo (where .hg is)
    Attempt to send changes to ssh://hg.opensolaris.org/hg/onnv/onnv-gate
    This script must be run as user "onhg".
    You should set up RBAC and use pfexec(1).

I ought to be able to test this inside the interactive shell:

    [th199096@jhereg ~/scratch]> python
    Python 2.4.4 (#1, Aug 25 2008, 03:30:42) [C] on sunos5
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import updateoso
    updateoso.py [-n]
    -n: dry run, no email sent (displayed on stdout)
    -R: root dir of repo (where .hg is)
    Attempt to send changes to ssh://hg.opensolaris.org/hg/onnv/onnv-gate
    This script must be run as user "onhg".
    You should set up RBAC and use pfexec(1).
    >>> config.GATE_USER = "duke"
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'config' is not defined

Okay, I should have known that wasn't going to work. It would probably work in the code (we'll see later), but for now this will work:

    >>> updateoso.config.GATE_USER = "duke"
    >>> reload(updateoso)
    updateoso.py [-n]
    -n: dry run, no email sent (displayed on stdout)
    -R: root dir of repo (where .hg is)
    Attempt to send changes to ssh://hg.opensolaris.org/hg/onnv/onnv-gate
    This script must be run as user "duke".
    You should set up RBAC and use pfexec(1).

To be honest, I knew the reference would work, but I expected it to be reset. In retrospect, I can see that I reloaded updateoso but not etc/config. Just something to get used to. I could force it to 'reset' via:

    >>> reload(etc/config)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'etc' is not defined
    >>> from etc reload(config)
      File "<stdin>", line 1
        from etc reload(config)
                      ^
    SyntaxError: invalid syntax
    >>> reload(updateoso.config)
    >>> reload(updateoso)
    updateoso.py [-n]
    -n: dry run, no email sent (displayed on stdout)
    -R: root dir of repo (where .hg is)
    Attempt to send changes to ssh://hg.opensolaris.org/hg/onnv/onnv-gate
    This script must be run as user "onhg".
    You should set up RBAC and use pfexec(1).

Took me a bit to figure out the syntax.
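The reload behavior above can be reproduced in isolation. This is only a sketch under a few assumptions: it targets Python 3, where the reload() builtin from the 2.4 session lives in importlib.reload, and it uses two throwaway modules (demo_config and demo_tool, names invented here) written to a temporary directory instead of the real updateoso.py and etc/config.py.

```python
import importlib
import os
import sys
import tempfile


def demo():
    """Return the banner seen after each step of the reload experiment."""
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "demo_config.py"), "w") as f:
            f.write('GATE_USER = "onhg"\n')
        with open(os.path.join(workdir, "demo_tool.py"), "w") as f:
            f.write('import demo_config\n'
                    'BANNER = "run as %s" % demo_config.GATE_USER\n')
        sys.path.insert(0, workdir)
        importlib.invalidate_caches()
        try:
            import demo_tool
            seen = [demo_tool.BANNER]

            # Patch the config module in memory, then reload only the tool.
            # demo_config is already in sys.modules, so it is not re-executed
            # and the patched value survives.
            demo_tool.demo_config.GATE_USER = "duke"
            importlib.reload(demo_tool)
            seen.append(demo_tool.BANNER)

            # Reloading the config module re-runs its source and resets it.
            importlib.reload(demo_tool.demo_config)
            importlib.reload(demo_tool)
            seen.append(demo_tool.BANNER)
            return seen
        finally:
            sys.path.remove(workdir)
            sys.modules.pop("demo_tool", None)
            sys.modules.pop("demo_config", None)


if __name__ == "__main__":
    print(demo())
```

The three banners mirror the session: the original user, the patched user after reloading only the outer module, and the original user again once the config module itself is reloaded.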

Okay, can I see the change from the script:

    [th199096@jhereg ~/scratch]> diff updateoso.py updateoso.py.second
    45,48d44
    sys.stderr, __USAGE

I don't expect this to work. And it doesn't.

    This script must be run as user "onhg".
    ...
    This script must be run as user "onhg".

How about a test driver script?

    [th199096@jhereg ~/scratch]> more test.py
    import updateoso

    print "Now change the user"
    updateoso.config.GATE_USER = "nark"
    reload(updateoso)

And that works:

    This script must be run as user "onhg".
    ...
    Now change the user
    ...
    This script must be run as user "nark".

Power outage scared the bejesus out of me, but not my Sun Ray

Kool Aid Served Daily - Mon, 10/06/2008 - 9:04pm

I just had a mini-power outage where all of my screens and printer went off. So did my Sun Ray. My heart is still beating fast from the shock.

I was right in the middle of an important editing session and was just then doing a save. All I could think of was did I get it in time or not?

It didn't matter - my Sun Ray server is on UPS. My Sun Ray had powered back up and I unlocked the screen in under 30 seconds. I could even see that I had just saved the file.

In case you can't tell, I love my Sun Ray setup. My office is quiet, my "machine room" is loud.


Learning a new language - python

Kool Aid Served Daily - Mon, 10/06/2008 - 4:07am

So I decided to learn Python. Why? Because it is used by Mercurial. And there is at least one of the gatekeeping scripts which I needed to hack for the nfs41-gate.

I bought Learning Python, Third Edition by Mark Lutz because the local Borders did not have Programming Python, Third Edition. From some reviews I read, the latter would probably have been a better fit for me.

I know I can find most of what I want on the net, but I wanted a printed resource.

Anyway, I had a question right off the bat: if file a imports modules b and c, what happens if c also imports b? Deeper in the book that I've read, it does state that an import is equivalent to loading the file if it is not already loaded. But that doesn't help me learn the language. :->

The following example is quite simple, but effective in answering the question for me:

a.py

    > cat a.py
    #!/usr/bin/python

    title = "This is the file a.py!"
    print title

    print "importing b from a"
    import b

    print "importing c from a"
    import c

b.py

    > cat b.py
    #!/usr/bin/python

    title = "This is the file b.py!"
    print title

c.py

    > cat c.py
    #!/usr/bin/python

    import b

    title = "This is the file c.py!"
    print title

Test 1

    > ./a.py
    This is the file a.py!
    importing b from a
    This is the file b.py!
    importing c from a
    This is the file c.py!

So we see that if a has loaded it, then c will not. How about the other way?

a2.py

    > cat a2.py
    #!/usr/bin/python

    title = "This is the file a.py!"
    print title

    print "importing c from a"
    import c

    print "importing b from a"
    import b

    print "and b's title is"
    print b.title

Test 2

    > ./a2.py
    This is the file a.py!
    importing c from a
    This is the file b.py!
    This is the file c.py!
    importing b from a
    and b's title is
    This is the file b.py!

We see that c loads b and that b's attributes are visible from a.

What would really help me here is if b could state the call stack of what is importing it.

Test 3

A simple change fails:

    > cat b.py
    #!/usr/bin/python

    title = "This is the file b.py!"
    print title
    print "Called from", __file__

but it does show the effect of byte code compilation:

    > ./a.py
    This is the file a.py!
    importing b from a
    This is the file b.py!
    Called from /home/tdh/python/b.py
    importing c from a
    This is the file c.py!
    > ./a2.py
    This is the file a.py!
    importing c from a
    This is the file b.py!
    Called from /home/tdh/python/b.pyc
    This is the file c.py!
    importing b from a
    and b's title is
    This is the file b.py!

I can see the "nesting" if I pop into an interactive session:

    > python
    >>> import a2
    This is the file a.py!
    importing c from a
    This is the file b.py!
    Called from b.pyc
    This is the file c.py!
    importing b from a
    and b's title is
    This is the file b.py!
    >>> a2.__dict__.keys()
    ['c', 'b', 'title', '__builtins__', '__file__', '__name__', '__doc__']
    >>> a2.c.__dict__.keys()
    ['b', 'title', '__builtins__', '__file__', '__name__', '__doc__']
    >>> a2.c.b.__dict__.keys()
    ['__builtins__', '__name__', '__file__', '__doc__', 'title']

But this doesn't answer my question of how to figure this out recursively. I.e., I guess I am looking for a parent "pointer" and I could walk it to get my answer.
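One way to approximate that parent "pointer" is to walk the interpreter's frame stack while the module body is still executing: every import in progress has a frame whose code name is "<module>". The following is a sketch only, assuming CPython 3 (sys._getframe is a CPython detail) and using three throwaway modules named chain_a, chain_b and chain_c that stand in for a2.py, b.py and c.py.

```python
import importlib
import os
import sys
import tempfile

# Source for the innermost module; it records who was importing it.
B_SOURCE = '''\
import sys

def import_chain():
    """Names of the modules whose top-level code is running, innermost first."""
    chain = []
    frame = sys._getframe(1)          # CPython-specific frame access
    while frame is not None:
        if frame.f_code.co_name == "<module>":
            chain.append(frame.f_globals.get("__name__"))
        frame = frame.f_back          # this is the "parent pointer"
    return chain

CHAIN = import_chain()                # recorded once, at first import
'''

MODULES = {
    "chain_b.py": B_SOURCE,
    "chain_c.py": "import chain_b\n",
    "chain_a.py": "import chain_c\nimport chain_b\n",
}


def demo():
    """Import chain_a and return the first three names chain_b recorded."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, source in MODULES.items():
            with open(os.path.join(workdir, name), "w") as f:
                f.write(source)
        sys.path.insert(0, workdir)
        importlib.invalidate_caches()
        try:
            import chain_a
            return chain_a.chain_b.CHAIN[:3]
        finally:
            sys.path.remove(workdir)
            for name in ("chain_a", "chain_b", "chain_c"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(demo())
```

The chain is only available on the first import, because the module body runs once; the second import of chain_b from chain_a finds it already loaded and never executes it again.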

But I've still learned more than just reading the book linearly.


The myth of security with AUTH_SYS

Kool Aid Served Daily - Mon, 10/06/2008 - 2:34am

AUTH_SYS is an insecure security mode, yet it is commonly used within companies. It can be used as the proverbial open lock on a door - the fact that the lock is there means do not enter. But I've seen people terminated for ignoring that lock.

With that in mind, I want to go over the simple security schemes employed within a company and show why they don't work. The punchline will be of course Kerberos. Speaking of myths, one is that you need NFSv4 in order to deploy Kerberos. You don't - common servers and clients easily speak Kerberos with NFSv3. And ignore NFSv2, please, please.

Export Security

With an export (or share), the most lax security is typically the default:

    [root@pnfs-9-26 ~]> zfs create rootpool/export/home/secure
    [root@pnfs-9-26 ~]> share
    [root@pnfs-9-26 ~]> zfs set sharenfs=on rootpool/export/home/secure
    [root@pnfs-9-26 ~]> share
    -@rootpool/exp /export/home/secure rw ""

I.e., every machine in the world can mount pnfs-9-26:/export/home/secure. The reasons for this default are simple:

  1. When the model was first developed
    1. the number of exports and clients was limited
    2. the degree of interconnectedness outside of the organization was limited
  2. It is hard to know beforehand where to limit access.

By default, root has access almost like any other user, but it is mapped to the user nobody. We can see this here if we grant wide open permissions on the export:

    [root@pnfs-9-26 ~]> chmod 777 /export/home/secure
    [root@pnfs-9-26 ~]> ls -la /export/home/secure
    total 6
    drwxrwxrwx 2 root     root  2 Oct 5 11:39 .
    drwxr-xr-x 5 th199096 staff 6 Oct 5 11:39 ..

We should be able to create a file as anyone from another machine:

    [root@jhereg ~]> mount -o vers=3 pnfs-9-26:/export/home/secure /mnt
    [root@jhereg ~]> touch /mnt/i_am_root

That worked:

    [root@pnfs-9-26 secure]> ls -la
    total 7
    drwxrwxrwx 2 root     root   3 Oct 5 11:51 .
    drwxr-xr-x 5 th199096 staff  6 Oct 5 11:39 ..
    -rw-r--r-- 1 nobody   nobody 0 Oct 5 11:51 i_am_root

Notice that root has been mapped to nobody. What happens if we do it as a normal user:

    [th199096@jhereg ~]> touch /mnt/i_am_jhereg
    [th199096@jhereg ~]> touch /mnt/i_am_th199096

And we get the correct user:

    [root@pnfs-9-26 secure]> ls -la
    total 9
    drwxrwxrwx 2 root     root   5 Oct 5 11:54 .
    drwxr-xr-x 5 th199096 staff  6 Oct 5 11:39 ..
    -rw-r--r-- 1 th199096 staff  0 Oct 5 11:54 i_am_jhereg
    -rw-r--r-- 1 nobody   nobody 0 Oct 5 11:51 i_am_root
    -rw-r--r-- 1 th199096 staff  0 Oct 5 11:54 i_am_th199096

Now what happens if we try to remove i_am_th199096 as root?

    [root@jhereg ~]> rm /mnt/i_am_th199096
    rm: /mnt/i_am_th199096: override protection 644 (yes/no)? y

anon

We are allowed to do that, but is it a property of being root or the permissions? We can check this with a simple change of the share:

    [root@pnfs-9-26 secure]> zfs set sharenfs=anon=-1 rootpool/export/home/secure
    [root@pnfs-9-26 secure]> share
    -@rootpool/exp /export/home/secure anon=-1 ""

See share_nfs(1M) for a description of anon. Notice I didn't specify whether rw is set or not. We can retry the delete:

    [root@jhereg ~]> rm /mnt/i_am_jhereg
    NFS3 getattr failed for pnfs-9-26: RPC: Authentication error; s1 = 13, s2 = 0
    rm: /mnt/i_am_jhereg: Permission denied

If you want to make sure to deny root level access to a share, then you need to set anon=-1.

Conversely, if you want to enable root level access to a share, you can set anon=0:

    [root@pnfs-9-26 secure]> zfs set sharenfs=anon=0 rootpool/export/home/secure
    [root@pnfs-9-26 secure]> share
    -@rootpool/exp /export/home/secure anon=0 ""

I've recreated the two files in the background (which shows by the way that rw is the default). And when we test the deletion:

    [root@jhereg ~]> rm /mnt/i_am_jhereg
    [root@jhereg ~]>

No pesky question that implies I am not a god!

root=

If I want to allow root access from one host but deny it from all others, I can use the root= access list:

    [root@pnfs-9-26 secure]> zfs set sharenfs=root=pnfs-9-25.central.sun.com rootpool/export/home/secure
    [root@pnfs-9-26 secure]> share
    -@rootpool/exp /export/home/secure sec=sys,root=pnfs-9-25 ""

PS: The sec=sys is stating this is an AUTH_SYS share. Also, since I am using DNS for hosts in /etc/resolv.conf, I need a FQDN.

Try to remove:

    [root@jhereg ~]> rm /mnt/i_am_th199096
    rm: /mnt/i_am_th199096: override protection 644 (yes/no)? yes

Since it worked and we got a prompt, it has to be the permission set which is enabling this. If we tighten things down a bit more:

    [root@pnfs-9-26 secure]> zfs set sharenfs=root=pnfs-9-25.central.sun.com,anon=-1 rootpool/export/home/secure
    [root@pnfs-9-26 secure]> share
    -@rootpool/exp /export/home/secure anon=-1,sec=sys,root=pnfs-9-25 ""

We can see we are locked out:

    [root@jhereg ~]> rm /mnt/i_am_root
    rm: /mnt/i_am_root: Permission denied

And yet the other machine reigns supreme:

    [root@pnfs-9-25 ~]> rm /mnt/i_am_root
    [root@pnfs-9-25 ~]>

We'll revisit the effectiveness of using root= without anon= when we look at permissions.

rw=

So we can keep machines from getting access altogether by restricting the rw= access list:

    [root@pnfs-9-26 ~]> zfs set sharenfs=rw=pnfs-9-25.central.sun.com rootpool/export/home/secure
    [root@pnfs-9-26 ~]> share
    -@rootpool/exp /export/home/secure sec=sys,rw=pnfs-9-25.central.sun.com ""

which yields on the two clients:

    [root@jhereg ~]> ls -la /mnt
    /mnt: Permission denied

and

    [root@pnfs-9-25 ~]> ls -la /mnt
    drwxrwxrwx  2 root     root   6 Oct 5 19:33 .
    drwxr-xr-x 36 root     root  39 Oct 5 19:11 ..
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:30 i_am_here
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:27 i_am_pnfs-9-25
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:27 i_am_pnfs_9_25
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:30 i_am_th199096

Note that the client jhereg must be caching a file handle for the root of the export /export/home/secure on the server pnfs-9-26. If it were not, we would have to reissue the mount request, which would have to fail. Also note, it is not just the mountd requests which have to check access list permissions. If it were, then the above operations would always work. SunOS used to work this way and the Solaris NFS team made a change back in the 1995/96 time frame, see for example Brent Callaghan's presentation at the 1996 Connectathon: NFS Client Authentication. And quickly, the security reason for doing so is the implication that if a rogue client somehow sniffed out a valid file handle, then it had complete access to all of the information on that share.

ro=

We can likewise grant read only access via the ro= access list.

Access list interactions

All of rw, rw=, ro, and ro= interact as described by share_nfs(1M).

File Permissions

So access lists work on machines. If a machine is able to mount a share from a server, then all users on that client can access everything on that server. Right?

Wrong. The directory and file permissions determine user access. Contrast this with a model derived from a client only having one user logged in at a time. In that situation, it may not be the machine which is important but rather the user.

If I wanted to only grant access to a single user, then I would set the owner of the share to be that user and I would also set the permissions to be 700:

    [root@pnfs-9-26 ~]> chown th199096:staff /export/home/secure/
    [root@pnfs-9-26 ~]> chmod 700 /export/home/secure/
    [root@pnfs-9-26 ~]> ls -la /export/home/secure/
    total 10
    drwx------ 2 th199096 staff 6 Oct 5 19:33 .
    drwxr-xr-x 5 th199096 staff 6 Oct 5 11:39 ..
    -rw-r--r-- 1 th199096 staff 0 Oct 5 13:30 i_am_here
    -rw-r--r-- 1 th199096 staff 0 Oct 5 13:27 i_am_pnfs-9-25
    -rw-r--r-- 1 th199096 staff 0 Oct 5 13:27 i_am_pnfs_9_25
    -rw-r--r-- 1 th199096 staff 0 Oct 5 13:30 i_am_th199096

And let's change the share to be wide open:

    [root@pnfs-9-26 ~]> zfs set sharenfs=on rootpool/export/home/secure
    [root@pnfs-9-26 ~]> share
    -@rootpool/exp /export/home/secure rw ""

We see root access is denied (because it maps to nobody):

    [root@pnfs-9-25 ~]> ls -la /mnt
    /mnt: Permission denied
    total 3

But on that same machine, th199096 is granted access:

    [root@pnfs-9-25 ~]> su - th199096
    [th199096@pnfs-9-25 ~]> ls -la /mnt
    total 12
    drwx------  2 th199096 staff  6 Oct 5 19:33 .
    drwxr-xr-x 36 root     root  39 Oct 5 19:11 ..
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:30 i_am_here
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:27 i_am_pnfs-9-25
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:27 i_am_pnfs_9_25
    -rw-r--r--  1 th199096 staff  0 Oct 5 13:30 i_am_th199096

By the way, if we grant either root= or anon=0 access, then this all goes out the window:

[root@pnfs-9-26 ~]> zfs set sharenfs=rw,anon=0 rootpool/export/home/secure

yields:

[root@pnfs-9-25 ~]> ls -la /mnt
total 12
drwx------ 2 th199096 staff 6 Oct 5 19:33 .
drwxr-xr-x 36 root root 39 Oct 5 19:11 ..
-rw-r--r-- 1 th199096 staff 0 Oct 5 13:30 i_am_here
-rw-r--r-- 1 th199096 staff 0 Oct 5 13:27 i_am_pnfs-9-25
-rw-r--r-- 1 th199096 staff 0 Oct 5 13:27 i_am_pnfs_9_25
-rw-r--r-- 1 th199096 staff 0 Oct 5 13:30 i_am_th199096

A client's root only gets to boss things around if the server grants permission.
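As a rough model of how the server treats a client's root under the anon= and root= options, consider the sketch below. It is a simplification for illustration, not the server's code: effective_uid is a hypothetical helper, the value 60001 for nobody is an assumption, and real access lists can hold more than plain host names.

```python
UID_NOBODY = 60001  # assumed uid of "nobody"

def effective_uid(client_uid, client_host, anon=UID_NOBODY, root_hosts=()):
    """Return the uid the server acts as, or None when the request is denied."""
    if client_uid != 0:
        return client_uid      # AUTH_SYS: ordinary uids are taken at face value
    if client_host in root_hosts:
        return 0               # root= list: this client's root stays root
    if anon == -1:
        return None            # anon=-1: unknown users are refused
    return anon                # otherwise root becomes the anonymous uid

print(effective_uid(0, "pnfs-9-25"))          # default share: root maps to nobody
print(effective_uid(0, "pnfs-9-25", anon=0))  # anon=0: root is root again
```

With anon=0 the anonymous uid is 0, which is why the 700 permissions stop protecting the directory from the client's root.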

The final myth of AUTH_SYS

Take a server for which the root account is locked down. Assume admins who don't want an inadvertent 'rm -rf /net' to nuke their server, so by default they create shares of the form:

[root@pnfs-9-26 ~]> zfs set sharenfs=rw,anon=-1 rootpool/export/home/secure

And further, at some point someone decides to lock down a share's permissions, i.e., 700 on the user th199096.

How long would it take someone to get access over AUTH_SYS?

Not long - even though we know root access is out and we can assume they do not know my password. Since we use NIS, they can do a 'ypcat passwd | grep th199096' and grab my uid. Then they only have to create a dummy account with that uid on a test machine.

What if we create a special account, not in NIS? Well, they may not have root access on the server, but if they have any access, then they could cd to the parent directory, issue an 'ls -la', see the user name, and then grep for it out of /etc/passwd.

You could lock down the machine, lock down the NIS database, etc. But the fact remains that if I can mount it, then I can create a simple script to try every UID until I get access. How many servers out there check for getattr storms?
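To see why trying every UID works, here is a toy model with no networking at all. It assumes a server that simply believes the uid a client asserts, which is the AUTH_SYS behavior described above; server_allows and first_uid_with_access are hypothetical names, and group membership is ignored.

```python
def server_allows(asserted_uid, owner_uid, mode=0o700):
    """Toy AUTH_SYS server: it trusts whatever uid the client claims."""
    if asserted_uid == 0:
        return False               # root is mapped away or denied
    if asserted_uid == owner_uid:
        return bool(mode & 0o400)  # owner read bit
    return bool(mode & 0o004)      # "other" read bit

def first_uid_with_access(owner_uid, candidates):
    """Walk candidate uids until the toy server says yes."""
    for uid in candidates:
        if server_allows(uid, owner_uid):
            return uid
    return None

print(first_uid_with_access(4321, range(1, 65536)))
```

Nothing in the model asks for proof of identity, so the only cost to the client is the number of guesses.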

The answer is to further restrict the access lists. But eventually, if I'm able to gain access to one of the restricted machines or if I can bring up my box with the same IP as one of the restricted machines, I can get access.

Kerberos

But all I need to do to combat this without all of these "extreme" measures is to enable Kerberos on the server:

[root@pnfs-9-26 ~]> zfs set sharenfs=sec=krb5,rw,anon=-1 rootpool/export/home/secure
[root@pnfs-9-26 ~]> share
-@rootpool/exp /export/home/secure anon=-1,sec=krb5,rw ""

I am the right user (actually, my uid on pnfs-9-25 matches the uid of the user th199096 on pnfs-9-26), but it fails:

[th199096@pnfs-9-25 ~]> ls -al /mnt
NFS3 access failed for pnfs-9-26: RPC: Authentication error; s1 = 13, s2 = 0
/mnt: Permission denied
total 3

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Beware expectations

Kool Aid Served Daily - Sun, 10/05/2008 - 10:15pm

I call a "share" an "export" because I learned the terminology at another company, one based on the SunOS style and not the Solaris style. It turns out I have other expectations on how shares work. I thought the following was legal:

[root@pnfs-9-24 ~]> zfs set sharenfs=rw=pnfs-9-25:jhereg rootpool/export/home/secure

And all I got was:

[root@pnfs-9-25 ~]> mount pnfs-9-24:/export/home/secure /mnt
nfs mount: mount: /mnt: Permission denied

I reinstrumented mountd to spit out some debug messages and I saw:

[root@pnfs-9-24 ~]> Oct 5 16:04:27 pnfs-9-24 mountd[1598]: Considering |pnfs-9-25| vs |pnfs-9-25.Central.Sun.COM|
Oct 5 16:04:27 pnfs-9-24 mountd[1598]: Considering |jhereg| vs |pnfs-9-25.Central.Sun.COM|
Oct 5 16:04:27 pnfs-9-24 mountd[1598]: Considering |pnfs-9-25| vs |pnfs-9-25.Central.Sun.COM|
Oct 5 16:04:27 pnfs-9-24 mountd[1598]: Considering |jhereg| vs |pnfs-9-25.Central.Sun.COM|
Oct 5 16:04:27 pnfs-9-24 mountd[1598]: pnfs-9-25.Central.Sun.COM denied access to /export/home/secure

So it never considers the FQDN. Interesting, so what happens if we add it?

[root@pnfs-9-24 ~]> zfs set sharenfs=root=pnfs-9-25.Central.sun.com,anon=-1 rootpool/export/home/secure

We see:

[root@pnfs-9-25 ~]> mount pnfs-9-24:/export/home/secure /mnt
[root@pnfs-9-25 ~]>

And on the console:

[root@pnfs-9-24 ~]> Oct 5 16:06:27 pnfs-9-24 mountd[1598]: Considering |pnfs-9-25.Central.sun.com| vs |pnfs-9-25.Central.Sun.COM|

By the way, the compare is case insensitive. This took me way longer to track down than I liked. And it had me going down dead-ends with other "bugs".

The share_nfs(1M) has this to say:

access_list
    The access_list argument is a colon-separated list whose components may be any number of the following:

    hostname
        The name of a host. With a server configured for DNS or LDAP naming in the nsswitch "hosts" entry, any hostname must be represented as a fully qualified DNS or LDAP name.

And sure enough:

[root@pnfs-9-24 ~]> grep hosts /etc/nsswitch.conf
# "hosts:" and "services:" in this file are used only if the
#hosts: nis [NOTFOUND=return] files
hosts: files dns
# before searching the hosts databases.
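The comparison mountd logged can be modeled in a few lines. This is a sketch of the observed behavior only, assuming plain host names in the list; host_allowed is a hypothetical helper, and it leaves out the other kinds of entry an access list can hold, such as netgroups.

```python
def host_allowed(access_list, client_fqdn):
    """True when any colon-separated entry equals the client's FQDN, ignoring case."""
    wanted = client_fqdn.lower()
    return any(entry.lower() == wanted
               for entry in access_list.split(":") if entry)

client = "pnfs-9-25.Central.Sun.COM"
print(host_allowed("pnfs-9-25:jhereg", client))           # short names never match
print(host_allowed("pnfs-9-25.Central.sun.com", client))  # FQDN matches, case ignored
```

The first call mirrors the denied mount and the second mirrors the one that worked.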

Besides RTFMing myself, which I had done earlier, but not well enough, I was struck by the thought that I wish we had made this choice at a previous company. It solves a lot of problems, reduces a lot of name server queries (which was many of the problems), but is not as flexible. Consider a multi-homed client thorton which can either be thorton.central.sun.com or thorton.be.central.sun.com. With just rw=thorton, we can leverage the search domains to allow access to both interfaces at once.

But, depending on the ordering in the search domains, we may end up sending more name lookups than we want. Also, I've heard some sysadmins espouse the belief that those interfaces represent different machines. And if you want both to have access, you explicitly grant them both access.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Lessons from the Last Crash

NetApps - Sat, 10/04/2008 - 8:03pm

In the past two weeks, I've had lots of people ask me how I think the financial meltdown will affect things. I don't have a crystal ball, but I thought it might be interesting to look back to the tech crash in 2000/2001.

I remember one of our executive staff meetings in particular, where it became clear how bad things were getting. One of the topics of the meeting was how sales were going, and Rob Salmon, who ran world-wide sales at the time, described an ugly picture. We were still winning deals, at least according to the lower level decision makers, but when it came time to collect the purchase order, we would find out that the CFO, or even the CEO, had frozen the funds at the last minute. It wasn't that anyone else was taking the business: the business was simply disappearing, or at least being delayed.

The executive staff meeting continued, and an hour or so later we had a status update on a big business software project that we were working on. I can't remember exactly what it was—ERP or CRM or something like that. Anyway, about twenty minutes into the conversation, Dan, our CEO, interrupted and said, "I don't think it makes sense to do this right now. It's just too expensive, and given the economic situation, too risky." The IT people giving the presentation said, "But it’s been approved. We already committed to our vendor!" Dan's response was, "It's not too late to cancel." That was that.

From the other side of the room, Rob Salmon groaned. He said, "I bet this is exactly the same conversation that is going on at our customers, right before they tell us that the deal we thought we had won just disappeared."

That was in 2001, but I have a hunch that the same conversation is going on in boardrooms all around the country. All around the world, for that matter. Remember, it's only two weeks ago Monday that Lehman collapsed. It seems pretty likely that CEOs and CFOs will reconsider, or at least delay, any big decisions that they can, even if they haven’t told their staff yet.

I’m curious about peoples’ experiences at different companies. What has the boss said so far?

Categories: NFS related news

So the closed binaries are live

Kool Aid Served Daily - Sat, 10/04/2008 - 3:00am

I just announced on nfs41-discuss that the closed-binaries went live! See Mercurial Repository created.

When I configured my community, I followed all of the steps outlined at Setting up a pnfs community except for the mdsadm command on the MDS server:

[root@pnfs-9-10 ~]> mdsadm -o add -t auth -a ip=10.1.233.50
adding: IP Addr - 10.1.233.50

I experienced the following on my MDS console, which are being investigated but do not appear to be fatal:

[root@pnfs-9-14 ~]> Oct 3 17:00:11 pnfs-9-14 /usr/lib/nfs/nfsd[101025]: write failed for /var/nfs/v4_state/mds_/010.001.233.053-6448e6939a: write(669938620) returned -1 errno=14 ss_len=669938600
Oct 3 17:05:55 pnfs-9-14 nfssrv: NOTICE: op_destroy_session: SP4_NONE

And ditto for these on the DS console:

[root@pnfs-9-13 ~]> dservadm enable
[root@pnfs-9-13 ~]> Oct 3 16:52:29 pnfs-9-13 dserv[101033]: bad cmd: 3
Oct 3 16:52:29 pnfs-9-13 last message repeated 1 time
Oct 3 16:52:32 pnfs-9-13 dserv: WARNING: CLNT_CALL() ds protocol to mds failed: 5
Oct 3 16:52:32 pnfs-9-13 dserv[101033]: ioctl failed: I/O error
sahre
sahre: Command not found.
[root@pnfs-9-13 ~]> share
-@data/nfs4 /data/nfs4 anon=0,sec=sys,rw ""
[root@pnfs-9-13 ~]> Oct 3 17:00:27 pnfs-9-13 /usr/lib/nfs/nfsd[101019]: write failed for /var/nfs/v4_state/mds_/010.001.233.053-6448e693c4: write(419178265) returned -1 errno=14 ss_len=419178245

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

New gate and closed bins build a working pNFS community

Kool Aid Served Daily - Fri, 10/03/2008 - 11:05pm

So I have a successful community up and running. I'll push the closed binaries out later tonight. Life impinges...

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Hey, the source browser is up and running

Kool Aid Served Daily - Fri, 10/03/2008 - 10:17pm

Check out nfsv41/nfs41-gate.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

First successful push to nfs41-gate on opensolaris

Kool Aid Served Daily - Fri, 10/03/2008 - 10:15pm

We had the first developer make a real push to the nfs41-gate on the OpenSolaris NFSv41 Project Repository.

Jim Wahlig had this to push:

[thud@adept nfs41-gate]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset: 7743:c672b1cb86be
user: Thomas Haynes
date: Thu Oct 02 22:28:30 2008 -0500
summary: Added tag closedv1 for changeset 9fab48a31a4a

changeset: 7744:763bfa203d1a
tag: tip
user: jwahlig@aus-build3
date: Fri Oct 03 11:52:59 2008 -0500
summary: fix stable storage on x86.

The only issue we encountered was that mail did not get accepted for the nfs41-discuss mailing list. I'll have to look at that.

You can also see above the tag I pushed for closedv1.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Building OpenSolaris inside SWAN

Kool Aid Served Daily - Fri, 10/03/2008 - 8:34pm

Okay, building OpenSolaris with the opensolaris.sh environment inside SWAN is different. I first tried it with:

% ws cleanroom
% nightly opensolaris.sh

And got garbage. I tried playing with some environment variables and didn't get anywhere. I then tried it with bldenv:

% exit
% cd cleanroom
% bldenv -d opensolaris.sh
% nightly opensolaris.sh

That went fast:

/opt/SUNWspro/bin/dmake
dmake: Sun Distributed Make 7.7 2005/10/13
number of concurrent jobs = 36
No 32-bit compiler found
*** Error code 1
The following command caused the error:
if /builds/th199096/cleanroom/usr/src/tools/proto/opt/onbld/bin/i386/cw -_cc -_versions >/dev/null 2>/dev/null; then \

Finally, I went back to the ws approach and with the following opensolaris.sh diffs:

[th199096@jhereg cleanroom]> diff opensolaris.sh usr/src/tools/env/opensolaris.sh
45c45
< GATE=cleanroom; export GATE
---
> GATE=testws; export GATE
48c48
< CODEMGR_WS="/builds/th199096/$GATE"; export CODEMGR_WS
---
> CODEMGR_WS="/export/$GATE"; export CODEMGR_WS
91c91
< STAFFER=th199096; export STAFFER
---
> STAFFER=nobody; export STAFFER
157c157
< #BUILD_TOOLS=/opt; export BUILD_TOOLS
---
> BUILD_TOOLS=/opt; export BUILD_TOOLS
159,161c159,160
< #SPRO_ROOT=/opt/SUNWspro; export SPRO_ROOT
< #SPRO_VROOT=$SPRO_ROOT; export SPRO_VROOT
< #__SSNEXT=""; export __SSNEXT
---
> SPRO_ROOT=/opt/SUNWspro; export SPRO_ROOT
> SPRO_VROOT=$SPRO_ROOT; export SPRO_VROOT
186d184

That seems to have worked. Now I need to test a pNFS community setup and run cthon.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Remember to read the README

Kool Aid Served Daily - Fri, 10/03/2008 - 6:16pm

So I have the closed binaries which correspond to the new nfs41-gate up on osol. I grabbed a copy of that source and started a build up. And it failed.

My thoughts were that either:

  1. I hosed the push to osol and thus all of the source was not there.
  2. The recent switch to Sun Studio 12 is impacting me.

The first is justifiable paranoia and the second has happened to me before. So, I searched my blog (more than 51% of why I blog is to have an easy to search repository of tips, tricks, and efdups.) and found this tidbit: RTFR - Or make sure you do read all of the README. Now it wasn't a direct hit, but what the hey, while I'm here I should read that README.

And sure enough, it has something on the compiler switch:

Please note that the compiler that comes with the Solaris Developer Express release is Studio 12, which is not the standard compiler for OpenSolaris code. If you use Studio 12, you will need to set __SSNEXT to the null string in your environment file. Please do report problems with Studio 12, particularly if the problem goes away when you use Studio 11 (the current standard compiler).

I'll rebuild with that change and see if it is a hit or the paranoia is justifiable after all.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

How to get a Mercurial workspace after creating a ZFS clone

Kool Aid Served Daily - Fri, 10/03/2008 - 5:36pm

I wrote about how I didn't know how to mix Mercurial and ZFS data sets together to get a new clone on a new dataset. Dave Marker provided this insight:

zfs create pool/ws/th199096/spe-build
cd /pool/ws/th199096/spe-build
hg init
echo "[paths]" > .hg/hgrc
echo "default = ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate" >> .hg/hgrc
hg pull -u

The trick is realizing that there is nothing magical about 'hg clone'.
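The hgrc half of that recipe is just a two-line INI file, which a script could write as well. A minimal sketch, assuming Python's standard configparser; write_default_path is a hypothetical helper, and the makedirs call only stands in for 'hg init', which creates more than an empty .hg directory:

```python
import configparser
import os
import tempfile

def write_default_path(repo_dir, url):
    """Write a [paths] section with default = url into repo_dir/.hg/hgrc."""
    hg_dir = os.path.join(repo_dir, ".hg")
    os.makedirs(hg_dir, exist_ok=True)  # stand-in for 'hg init' in this sketch
    cfg = configparser.ConfigParser()
    cfg["paths"] = {"default": url}
    hgrc = os.path.join(hg_dir, "hgrc")
    with open(hgrc, "w") as fh:
        cfg.write(fh)
    return hgrc

# Demonstrate in a scratch directory rather than a real ZFS dataset.
scratch = tempfile.mkdtemp()
path = write_default_path(scratch, "ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate")
print(open(path).read())
```

After that, 'hg pull -u' in the real workspace does the work that 'hg clone' would have done.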

And if at this point I want to do a closed gate, I can use my normal incantation because it will be on the same dataset.

And I can then create a ZFS snapshot and clone that to my heart's desire.

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Setting up a Development Project Gate

Kool Aid Served Daily - Fri, 10/03/2008 - 7:08am

When we transitioned from TeamWare (tw) to Mercurial (hg), I made several attempts to craft a group workspace. The killer always seemed to be that I had to manually do an 'hg update' in the gate and that was too difficult to remember. In the end, I decided to mimic what the ON gatekeepers were doing. At first, I did it all by brute force, copying everything that they had in place. Eventually, since I didn't know Python, I started asking Dave Marker for help. And boy, did I get some. Anyway, here is what I went through to set up nfs41-gate and nfs41-clone.

Setup a restricted user account

You want to create a restricted user account for a couple of reasons. At first I thought this was to just keep people from sneaking a look at the hgrc files and such, but there is a broader need in that you want to force all writes to the gate to come through a single account. That way you can configure the push process to always go through some sanity checks. You don't want this to be your regular account, because it will restrict your ability to do things. I'll show that to you in a bit.

[nfs4hg@aus1500-home hook]> grep nfs4hg /etc/passwd
nfs4hg:x:3530:1813:Mr. NFS4 HG:/pool/nfs4hg:/usr/bin/tcsh
[nfs4hg@aus1500-home hook]> grep 1813 /etc/group
mhg::1813:th199096

You want a uid and gid which is not in the NIS maps and you want the account to be local to your gate machine.

Setup a ZFS filesystem for your gate and clone

You want to leverage ZFS because snapshots and clones are your safety nets. A snapshot saves you from a bad command and a clone lets you try new things in a sandbox.

zfs create pool/ws/nfs41-gate
zfs create pool/ws/nfs41-clone
chown nfs4hg:mhg /pool/ws/nfs41-gate /pool/ws/nfs41-clone

Populate your gate and clone

Be sure to login as your restricted user:

warlock % ssh aus1500-home
aus1500-home % su - nfs4hg

Get used to doing it this way - you won't be able to ssh directly as nfs4hg before too long.

I'm going to assume a new branch off of onnv-gate. If you have an existing tw or hg workspace, you can substitute in the relevant commands to create the gate. I never want to speak of migrating TeamWare to Mercurial.

I don't know why, but I have to do something like:

cd /pool/ws/nfs41-gate
hg clone ssh://anon@hg.opensolaris.org/hg/onnv/onnv-gate
mv onnv-gate/.hg* .
mv onnv-gate/usr .
rm -rf onnv-gate

I find Mercurial doesn't like it if the target directory already exists. If you are inside SWAN, make sure to also get the closed bits:

cd usr
hg clone ssh://anon@onnv.eng/export/onnv-clone/usr/closed

And now make your clone off of your new gate. You want to make sure that the hg paths are correct:

[nfs4hg@aus1500-home nfs41-clone]> hg paths
default = /pool/ws/nfs41-gate

Retrieve the gatekeeping Python extensions

You can get these via:

[thud@adept ~/foo]> hg clone ssh://anon@hg.opensolaris.org/hg/scm-migration/onnv-gk-tools
destination directory: onnv-gk-tools
requesting all changes
adding changesets
adding manifests
adding file changes
added 40 changesets with 171 changes to 50 files
35 files updated, 0 files merged, 0 files removed, 0 files unresolved

The onnv-gk-tools are the heart of setting up your project gate. For the nfs41-gate, I put them outside of both the gate and the clone:

[nfs4hg@aus1500-home ws]> zfs list | grep onnv-gk
pool/onnv-gk-tools 630K 4.31T 630K /pool/onnv-gk-tools

Again, leverage ZFS for things like this tool set.

Read onnv-gk-tools/README at this point. I may have done things from there and forgotten to mention them here.

Copy hgrc files to gate and clone

You'll want to copy the existing hgrc files over to your gate and clone:

cp gate-hgrc /pool/ws/nfs41-gate/.hg/hgrc
cp gate-closed-hgrc /pool/ws/nfs41-gate/usr/closed/.hg/hgrc
cp clone-hgrc /pool/ws/nfs41-clone/.hg/hgrc
cp clone-closed-hgrc /pool/ws/nfs41-clone/usr/closed/.hg/hgrc

Set permissions on the gate and clone

I can't say this enough, from the README:

gate enforcement:
Mercurial only lets you pull if you can read {REPO}/.hg
Mercurial only lets you push if you can write {REPO}/.hg
So {GATE} and {GATE}/usr/closed are owned by onhg/gk
Mode is set to 0770
{CLONE} and {CLONE}/usr/closed are also owned by onhg/gk
But mode is set to 0775

Comment out temp hack in hook/notify.py

So, this should be configurable, but for now you want to comment out the following to avoid spamming a mailing list:

[nfs4hg@aus1500-home onnv-gk-tools]> diff hook/notify.py ~/onnv-gk-tools/hook/notify.py
80c80
< # m.msg["Bcc"] = "onnv-flagdays@onnv.eng"
---
> m.msg["Bcc"] = "onnv-flagdays@onnv.eng"

Note that you want to make sure the '#' is added right where the 'm' was - I hear Python really cares about indentation.
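If you would rather script that edit than do it by hand, a sketch like the following keeps the indentation intact. The sample source is a made-up stand-in for notify.py, and comment_out is a hypothetical helper, not part of onnv-gk-tools:

```python
def comment_out(source, needle):
    """Prefix '# ' to lines containing `needle`, keeping their leading whitespace."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if needle in stripped and not stripped.startswith("#"):
            indent = line[: len(line) - len(stripped)]
            line = indent + "# " + stripped
        out.append(line)
    return "\n".join(out)

# Made-up fragment with the same shape as the line in the diff above.
sample = (
    "def notify(m):\n"
    "    m.msg[\"Bcc\"] = \"onnv-flagdays@onnv.eng\"\n"
    "    return m\n"
)
patched = comment_out(sample, "onnv-flagdays")
print(patched)
compile(patched, "<sample>", "exec")  # still valid Python after the edit
```

Because the '#' lands where the statement began, the surrounding block keeps its shape.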

Copy on-hg.py to the homedir

You'll want to copy this file to the restricted account's homedir and rename it as well. You don't want to inadvertently refer to the ON one at any point:

cp on-hg.py ~/nfs4-hg.py

Edit the new etc/config.py file

This is the step that configures Mercurial to understand your gate.

I'm not going to step through the changes; I feel they are self-explanatory. Note, though, that later we will see that some of the Python scripts do not make use of parts of this file. For example, GATE_USER could be used in the hook/updateoso.py file.

Hmm, and so far, all of the user accounts and paths needed for these changes exist.

[nfs4hg@aus1500-home onnv-gk-tools]> diff etc/config.py ~/onnv-gk-tools/etc/config.py
87c87
< GATE_NAME = "nfs41"
---
> GATE_NAME = "onnv"
89c89
< GATE_WS = "/pool/ws/%s-gate" % (GATE_NAME)
---
> GATE_WS = "/ws/%s-gate" % (GATE_NAME)
91c91
< CLONE_WS = "/pool/ws/%s-clone" % (GATE_NAME)
---
> CLONE_WS = "/ws/%s-clone" % (GATE_NAME)
94c94
< GATE_DIR = "/pool/ws/nfs41-gate"
---
> GATE_DIR = "/export/onnv-gate"
96c96
< CLONE_DIR = "/pool/ws/nfs41-clone"
---
> CLONE_DIR = "/export/onnv-clone"
99,104c99,104
< GATE_HOST = "aus1500-home"
< GATE_ALTHOST = "aus1500-home"
< GATE_HOST_X = "aus1500-home"
< GATE_HOST_S = "aus1500-home"
< GATE_DOMAIN = "central"
< GATE_MAIL = "aus1500-home.central"
---
> GATE_HOST = "elpaso"
> GATE_ALTHOST = "juarez"
> GATE_HOST_X = "elpaso"
> GATE_HOST_S = "juarez"
> GATE_DOMAIN = "sfbay"
> GATE_MAIL = "onnv.eng"
106,110c106,110
< GATEKEEPER = "th199096"
< ASSTGATEKEEPER = "rmesta"
< TECHLEAD = "th199096"
< ASSTTECHLEAD = "rmesta"
< CTEAMLEAD = "webaker"
---
> GATEKEEPER = "dm120769"
> ASSTGATEKEEPER = "suha"
> TECHLEAD = "jbeck"
> ASSTTECHLEAD = "nickto"
> CTEAMLEAD = "muolla"
112,113c112,113
< ALIAS_GK = "th199096@%s" % (GATE_MAIL)
< ALIAS_GATEKEEPER = "th199096@%s" % (GATE_MAIL)
---
> ALIAS_GK = "gk@%s" % (GATE_MAIL)
> ALIAS_GATEKEEPER = "gatekeeper@%s" % (GATE_MAIL)
115,116c115,116
< GATE_USER = "nfs4hg"
< GATE_GROUP = "mhg"
---
> GATE_USER = "onhg"
> GATE_GROUP = "gk"
118,119c118,119
< SNAPS_DIR = "/pool/ws/snapshot"
< BUILDS_DIR = "/pool/ws/builds"
---
> SNAPS_DIR = "/export/snapshot"
> BUILDS_DIR = "/export/builds"

Edit the hgrc files

Now we go back and edit the hgrc files for the various pieces. These modifications tell the gate how to interact with the clone, etc.

Gate's hgrc

I will annotate these changes:

[nfs4hg@aus1500-home .hg]> diff hgrc ~/onnv-gk-tools/gate-hgrc
17c17
< hook = /pool/onnv-gk-tools/hook
---
> hook = /export/onnv-gate/public/python/hook

Okay, we need to tell the gate where our config.py file is and how to use the extensions. The above does that. Note that if we do not make this change, we could impact ON.

20c20
< gatename = nfs41-gate
---
> gatename = onnv-gate
23c23
< wlock = nfs4hg, th199096
---
> wlock = onhg, dm120769, suha
27,28c27,28
< recv = pnfs-core@sun.com
< #logmail = onnv-gate-putback-log@onnv.eng
---
> recv = onnv-gate-notify@onnv.eng
> logmail = onnv-gate-putback-log@onnv.eng
32,33c32,33
< recv = thomas.haynes@sun.com
< rti = False
---
> recv = onnv-putback-diffs@onnv.eng
> rti = True

With a development gate, you bypass the RTI process. So, we should bypass the checking for it.

36,38d35
< [web]
< baseurl = http://aus1500-home.central
<
40c37
< url = http://aus1500-home.central/pool/ws/nfs41-gate
---
> url = http://onnv.sfbay/net/onnv.sfbay
43c40
< temp = /pool/nfs4hg/webrev
---
> temp = /space/webrev

Ah, we will want to create this directory. I understand that you want this directory to have parents that do not contain ".hg/" or "Codemgr_wsdata/". If even one parent has a subdirectory with one of these names, it will mess things up.

45,48c42,49
< #[rti]
< #webrticli = /net/webrti/export/home/bin/webrticli
< #url = http://webrti.sfbay/rti/xml/index.php
< #project = on
---
> # advocate is only set for restricted builds.
> # When set only those listed (separated by commas) are valid RTI advocates.
> # Any others used will cause a rollback from rti.py
> [rti]
> webrticli = /ws/onnv-gate/public/bin/webrticli
> url = http://webrti.sfbay/rti/xml/index.php
> project = on
> #advocate = John (dot) Beck (at) sun (DOT) com

We really, really want to bypass RTI checking and John really doesn't want to be spammed by this. (And this is the only place I changed the source.)

51c52
< comchk = False
---
> comchk = True

We know comments in a development gate are not going to be valid, so do no checks.

74c75
< #pretxnchangegroup.2 = python:hook.rti.rti
---
> pretxnchangegroup.2 = python:hook.rti.rti

Again, no RTI at all!

82c83
< changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate /pool/ws/nfs41-clone
---
> changegroup.0 = /usr/bin/hg push -R /export/onnv-gate /export/onnv-clone
91d91

Ahh, we should get the above out of etc/config.py, no?
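To make that question concrete, here is what deriving the hook line from the config values could look like. This is only a sketch of the idea: onnv-gk-tools does not do this as far as the hgrc files above show, and changegroup_push is a hypothetical helper. The three variables copy the nfs41 side of the config.py diff.

```python
# Values as set on the nfs41 side of the etc/config.py diff.
GATE_NAME = "nfs41"
GATE_WS = "/pool/ws/%s-gate" % (GATE_NAME)
CLONE_WS = "/pool/ws/%s-clone" % (GATE_NAME)

def changegroup_push(gate, clone, hg="/usr/bin/hg"):
    """Build the hgrc hook line that pushes the gate into the clone."""
    return "changegroup.0 = %s push -R %s %s" % (hg, gate, clone)

print(changegroup_push(GATE_WS, CLONE_WS))
```

Changing GATE_NAME in one place would then change every path that hangs off it.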

BTW: The two lines I care about most here are:

changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate /pool/ws/nfs41-clone
changegroup.1 = /usr/bin/hg update

Basically, after a push occurs, first push that change to the clone and then run update. See, I don't want to be doing that manually!

Also, note that because of the following lines:

[gatehooks]
gatename = nfs41-gate
logdir = public/log
lockdir = public/lock

You will want to create:

cd /pool/ws/nfs41-gate
mkdir -p public/log
mkdir public/lock

Gate's closed hgrc

All of the above changes apply, but the only real diff is the following:

81,82c81
< # push to hg.os.o will be done out of cron.
< changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate/usr/closed /pool/ws/nfs41-clone/usr/closed
---
> changegroup.0 = /usr/bin/hg push -R /export/onnv-gate/usr/closed /export/onnv-clone/usr/closed
91d89

And again, the changes will automatically occur to the clone.

Clone's hgrc

This one is much simpler, mainly because there is not much there:

[nfs4hg@aus1500-home .hg]> cd /pool/ws/nfs41-clone/.hg
[nfs4hg@aus1500-home .hg]> diff hgrc ~/onnv-gk-tools/clone-hgrc
12c12
< default = /pool/ws/nfs41-gate
---
> default = /export/onnv-gate

We tell the clone where the parent is located.

15c15
< hook = /pool/onnv-gk-tools/hook
---
> hook = /export/onnv-gate/public/python/hook
19c19
< gate = file:/pool/ws/nfs41-gate
---
> gate = file:/export/onnv-gate
36d35

Where are the hooks and the gate? All of this should be in etc/config.py.

BTW an important line here is:

# These hooks are run from bghook() in the background
bg-changegroup.0 = python:hook.updateoso.updateoso

When the clone gets updated, then we will push a change out to OpenSolaris!

If you don't have a repository out there, shame on you! Well, just comment out this line.

Clone's closed hgrc

Exact same diffs as above.

Understanding some things in the hgrcs

A big difference between the gate and the clone is in the hgrcs. The gate is write only and the clone is read only.

So the clone has to prevent writes before they occur. This line does that:

# This prevents boneheaded gatekeepers and gives a more useful message
# to gatelings who trust our hooks.
prechangegroup.0 = python:hook.cloneincoming.cloneincoming

And I haven't figured out how the gate keeps people from reading. Note, yes I have, see onnv-gk-tools/README. So part of the above may be wrong....

Modify that file in the homedir

These appear pretty self-explanatory:

[nfs4hg@aus1500-home ~]> diff nfs4-hg.py onnv-gk-tools/on-hg.py
45c45
< HGLOGIN = "nfs4hg"
---
> HGLOGIN = "onhg"
54,55c54,55
< "/pool/ws/nfs41-gate",
< "/pool/ws/nfs41-gate/usr/closed",
---
> "/export/onnv-gate",
> "/export/onnv-gate/usr/closed",

Configuring ssh access

Okay, we almost have everything done that I remember. At this point, you need to start sending emails to your developers for them to send you in their SSH public keys -- see opensolaris.org SSH key help. They need to do this for ON anyway.

Once you get them, then you will add them to the ~/.ssh/authorized_keys of your restricted account. The format of each entry will be:

command="~/nfs4-hg.py 'th199096 ' ",no-port-forwarding,no-X11-forwarding,no-agent-forwarding [the contents of their id_dsa.pub file]

You will have one per user.

The format needed here is discussed in onnv-gk-tools/README.
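With more than a handful of developers, generating those entries beats typing them. A minimal sketch, assuming the wrapper takes the login as its single quoted argument; authorized_keys_entry is a hypothetical helper and the key string is a placeholder. Check the exact quoting against onnv-gk-tools/README before using anything like this.

```python
def authorized_keys_entry(user, public_key, wrapper="~/nfs4-hg.py"):
    """Build one authorized_keys line that forces every login through the wrapper."""
    options = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding"
    return 'command="%s \'%s\'",%s %s' % (wrapper, user, options, public_key.strip())

# Placeholder key material, not a real key.
print(authorized_keys_entry("th199096", "ssh-dss AAAA...placeholder th199096@example"))
```

The forced command is what lets the restricted account know which developer is pushing, since everyone logs in as the same user.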

Disclaimer

I did all of this a month or so ago. I am reconstructing what I did. I may have missed some steps.

All of the steps reported here are mine. All mention of possible bugs is my opinion.

Notes

Cron jobs

The only cron job I have running is:

[nfs4hg@aus1500-home ~/onnv-gk-tools]> crontab -l
7 3 * * * /pool/ws/scripts/buildtags.sh /pool/ws/nfs41-clone/developer.sh

This will rebuild the cscope and tags databases in the clone. I could do this in another clone, but I like it occurring in a well known place. I do not want it in the gate.
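For anyone who does not read crontab entries daily, the five leading fields are minute, hour, day of month, month, and day of week. A small sketch that splits the entry above; parse_cron_line is a hypothetical helper that handles only this simple layout:

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")

def parse_cron_line(line):
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields and a command")
    entry = dict(zip(CRON_FIELDS, parts[:5]))
    entry["command"] = parts[5]
    return entry

job = parse_cron_line(
    "7 3 * * * /pool/ws/scripts/buildtags.sh /pool/ws/nfs41-clone/developer.sh"
)
print("%02d:%02d" % (int(job["hour"]), int(job["minute"])))  # prints 03:07
```

So the databases are rebuilt every day at 03:07 local time on the gate machine.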

Automatic push to OpenSolaris

I haven't provided the details on how to configure an automatic push to OpenSolaris...

A good link for jumping off: How to Use Mercurial (hg) Repositories

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

How to tie our closed-bins to the new gate

Kool Aid Served Daily - Fri, 10/03/2008 - 4:41am

I was trying to relax and I realized we would have an ongoing problem in keeping the new ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate in sync with our copy of the closed binaries. But, I think we will be saved by a couple of things:

  1. We don't update the internal nfs41-gate automatically with every change in the onnv-gate. We actually normally sync up with the 2 week releases. This means that random changes to the closed source will not impact the osol gate. As a matter of fact, we control when a change causes a respin of the closed bits.
  2. Just like the ON gatekeepers tag their gate every 2 weeks, we could also use 'hg tag' to mark when the closed binaries changed. We could store the closed binaries on the project download page and when an external developer saw the tag change, they could then pick up a new copy.

Plus with setting the mail to go out to the dev mailing list, people would be able to see a need to pick up a new set of closed binaries.
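An external developer could also spot such a tag mechanically. The sketch below scans 'hg incoming' style text for the summary line that 'hg tag' leaves behind; the sample text copies the format of the output in this post, and added_tags is a hypothetical helper.

```python
import re

# Matches the commit summary that 'hg tag' writes by default.
TAG_RE = re.compile(
    r"^summary:\s+Added tag (\S+) for changeset ([0-9a-f]+)", re.MULTILINE
)

def added_tags(hg_output):
    """Return (tag, changeset) pairs announced in hg log or hg incoming text."""
    return TAG_RE.findall(hg_output)

sample = """changeset: 7743:c672b1cb86be
tag: tip
user: Thomas Haynes
date: Thu Oct 02 22:28:30 2008 -0500
summary: Added tag closedv1 for changeset 9fab48a31a4a
"""
print(added_tags(sample))
```

A new closedvN tag in the list would be the cue to fetch a fresh set of closed binaries.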

[thud@adept src]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset: 7743:c672b1cb86be
tag: tip
user: Thomas Haynes
date: Thu Oct 02 22:28:30 2008 -0500
summary: Added tag closedv1 for changeset 9fab48a31a4a

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Juggling work load

Kool Aid Served Daily - Fri, 10/03/2008 - 3:32am

The group has so much to do and it feels like so little time to do it. I don't think anyone just codes. I'm looking at my action list and it is all over the place:

Test fix for 6738223 Can not share a single IP address
Unit testing was successful, but mini-PIT testing is mixed. I think this is more a configuration issue than a code issue.
Push code review along for 6751438 mirror mounted mountpoints panic when umounted
Frank gave me a good review/discussion, but I need another reviewer.
Figure out how to configure a minimal jumpstart
I could ask someone to do it, but I've been meaning to understand jumpstarting myself. This is an easy way to ease into it.
Open up an OpenSolaris gate for NFSv41
Chewed up a large chunk of my day. Hey, I need to do a test integration to see if it works. If it does, I have to run because Dave Marker is going to eat my beating heart. And it works! Here is my Linux box at home:

[thud@adept src]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset: 7742:9fab48a31a4a
tag: tip
user: Thomas Haynes
date: Thu Oct 02 21:19:03 2008 -0500
summary: Test of push to osol

[thud@adept src]> hg pull -u
pulling from ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Build closed-binaries for OpenSolaris
Trivial, but time consuming. I will need to also install and test. We also just got rid of auth records, so I need to see how we are changing the configuration of the DS and MDS.
Figure out how to translate data path to guuid
The above mentioned server change really hosed up my progress on spe. I've got the final pieces in my mind, but I need to just grind it all out. And then I need to start testing. I may need to go to VirtualBoxes to get enough DSes for testing.
D'oh, need to get a VirtualBox image together for OpenSolaris
I can piggyback on the push of the closed binaries.
Modernize style of blog site
Looks dated and I want to add a tag cloud. At least one of the shared styles will do this, but I just don't want to use it.
Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news

Setting up a NFS41 gate

Kool Aid Served Daily - Fri, 10/03/2008 - 3:00am

We just opened up a new Mercurial gate of NFSv41 on OpenSolaris.org. Eventually it will automatically push changes as they occur to our gate. I also need to figure out a way to automatically update the closed-bins.

The hardest part was figuring out the naming convention. Some links of interest are Some work on libMicro; Mercurial transition notes and finally How to Use Mercurial (hg) Repositories. Look for For Project Leads: How to set up a Mercurial repository.

Update: Also, SCMVolunteers, look for Setting up a new (Mercurial) Project repository on OpenSolaris.org.

In any event, you can grab a copy of the source at:

hg clone ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate

Note the lack of a double '/' after the FQDN - normally I would take that as a sign of a bug with Mercurial.

Note that while this compiles, you can't run it without a corresponding closed-bins.

Eventually, you should be able to browse the source via Cross Reference: nfs41-gate.

And a big thanks to David Marker for providing the help necessary to get this to go live!

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Categories: NFS related news