Dave,
Thanks for your comments. I'd like to respond in detail.
I'm sorry this is quite long, but it does serve to explain
much of how we arrived at the current Sense reference design
and implementation.
----------------------------------------------------------------------
> My list of requirements for an event notification service is a bit
> shorter than the published SENSE requirements document. Part of
> this is because I'm still trying to envision the abstract service,
> and the SENSE requirements drill down much further. Perhaps more
> importantly, it is also because I'm looking for a service that is
> scalable from very simple implementations on up, and the SENSE
> requirements focus on the high end.
>
> My starting point is the idea of a notification service that, at its
> simplest, allows reliable, efficient event notification from an
> agent to a client. In this form, the service should be very
> lightweight to implement. On top of this, this basic service should
> be designed so it can be used as a building block to construct large
> scale services. And I think we should be able to do this in such a
> way that the scaling is transparent to both agent and client.
>
> To be more specific, I'd like to see a simple agent/client service
> that can also be rewired as an agent/proxy/client service without
> change to the agent and client.
Agreed. Couldn't state it better myself. To reach this, we must be
aware--from the very start--of which aspects of scalability we need
to address.
I'd like to talk about Sense and scalability for a few moments.
The first scalability issue we addressed was the maximum number of
available TCP connections in small embedded agent environments;
hence, the requirement for datagrams (UDP) over streams (TCP).
The second scaling factor we recognized was the maximum
number of client "sessions" (or registrations) a single agent
could be expected to handle. Again, embedded agents may only
be able to support a relatively small number, quite possibly
fewer than the number of clients expected to request services
of the agent within an enterprise environment.
However, we didn't want the client to have to deal with issues
of agent limitations in this regard; hence, the early adoption
of a 3-tiered model with a central Server, in which the Server
could be easily run on scalable hardware (ie, a general host
system, such as NT, Unix, NetWare, etc). That way, agents (called
"Publishers" in Sense) can immediately be built with the up-front
expectation of being small in terms of capacity for clients
(called "Subscribers" in Sense).
Of course, there is no reason why an embedded system can't
function as both Server and Publisher (or multiple Publishers).
The real value in the use and concept of the Server is to allow
the CUSTOMER the ability to centralize potentially large numbers
of printer definitions in a single service, allowing easier
discovery by clients looking for those objects (printers).
An implementation goal of Sense is to allow the implementer
to easily choose the approach (Publisher only, or Server/Publisher)
such that implementing the Server component is not an excessively
large amount of effort over the simple Publisher-only side. Of
course, the resource requirements would be quite a bit higher
in the combined Server/Publisher case, but the additional code
required would be modest.
For scalability of implementation, we mandated the use of ANSI C
and Berkeley (BSD) sockets as the *SOLE* underlying technology;
the entire reference implementation has been coded with these
constraints, with the hope that the code base could be easily
adopted by embedded system developers, as well as host-based
developers.
This philosophy serves to support another key goal of Sense,
namely, rapid and pervasive deployment--which may be viewed
as "scalability of deployment". While we don't believe Sense
has a high critical mass factor (ie, pervasive deployment is
not necessary to gain substantial value from Sense), everyone
stands to gain big time if it is integrated with most components
of the enterprise network printing environment.
It was always our hope that printer vendors would take the
approach of only implementing the Publisher(s) within the
printer's embedded environment, based on the notion that a
public domain Sense Server would be available for public
distribution (either bundled with the printer software, or
available free on the Web). That way vendors could minimize
their development time/cost to get into the Sense arena.
----------------------------------------------------------------------
> Now bear with me, because I'm not an expert in event notification
> services (I just play one on mailing lists), but here's my edit on
> requirements:
>
> A. Schema based
> The idea being that there is an overall grouping of the
> information for each class of event producer, much like a
> directory entry schema.
Yes, event sources (collections of events in which a client has
interest) must have some kind of distinguished name space that
clients can query. The design does not address the "class
of event producer" as you mention; rather, it is defined in terms
of "class of event", described in terms of:
- Format (syntax)
- Content (semantics)
- Periodicity (frequency)
An instance of an event class is called an "Edition" within Sense.
Where the "Publication" would be used to model the target entity
(for example, a printer), an "Edition" is used to model the source
of a single defined event stream associated with a given Publication.
Another key assumption made early on was that a "producer" could
very easily produce multiple instances of various event classes.
For example, the same producer (Publisher) might produce (publish)
an event class (Edition) having to do with device faults
(possibly geared around the Printer MIB), while at the same time
publish an Edition emitting accounting or other resource usage
records.
This is why we derived the two-level "Publication/Edition" model.
We figured a client had two primary discovery jobs:
1. Locate target objects (eg, printers)
2. Locate compatible event streams associated with the object
(ie, event streams that the client is able to process)
I'd like to discuss the nature and motivation for the two-level
Publication/Edition model in more detail in a future message,
as this design approach still seems to confuse some folks.
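
In the meantime, here is a minimal ANSI C sketch of the two discovery
steps listed above. It is only an illustration: the struct layouts and
the names find_publication() and find_edition() are hypothetical and
are not taken from the Sense specification or reference code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of the two-level model:
 * a Publication (eg, a printer) owns zero or more Editions,
 * each identified here only by its event class name. */
struct edition {
    const char *event_class;
};

struct publication {
    const char *name;
    const struct edition *editions;
    size_t edition_count;
};

/* Step 1: locate the target object by name; NULL if not found. */
const struct publication *
find_publication(const struct publication *pubs, size_t count,
                 const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(pubs[i].name, name) == 0)
            return &pubs[i];
    }
    return NULL;
}

/* Step 2: locate an event stream of a class the client can process;
 * NULL if the Publication offers no such Edition. */
const struct edition *
find_edition(const struct publication *pub, const char *event_class)
{
    size_t i;

    for (i = 0; i < pub->edition_count; i++) {
        if (strcmp(pub->editions[i].event_class, event_class) == 0)
            return &pub->editions[i];
    }
    return NULL;
}
```

A client would call find_publication() first and then, only if that
succeeds, find_edition() for each event class it knows how to handle.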
----------------------------------------------------------------------
> B. Reliable receipt of event messages
> I believe this implies client registration for notification (and
> renewal and cancellation), client acknowledgement of receipt,
> and retransmission by the agent.
>
> C. Efficient delivery of event messages
> I believe this implies client registration, asynchronous
> delivery of messages by the agent.
Agreed. Again, these features might be a bit much to support
large numbers of clients within an embedded application. That
is why we felt it necessary to provide a scalable Server component
to support large numbers of clients--and therefore ease the burden
of both the vendor and the customer.
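
To give a feel for the per-client state that acknowledgement and
retransmission imply, here is a rough ANSI C sketch. It is not the
Sense wire protocol: the sequence number, the resend limit and the
names on_ack() and on_timeout() are all assumptions made for the
example.

```c
/* Hypothetical bookkeeping for one event sent to one client. */
struct pending_event {
    unsigned long seq;   /* assumed per-event sequence number */
    int resends;         /* retransmissions performed so far  */
    int acked;           /* nonzero once the client has acked */
};

enum next_action { ACTION_DONE, ACTION_RESEND, ACTION_GIVE_UP };

/* Record an acknowledgement; acks for other events are ignored. */
void on_ack(struct pending_event *ev, unsigned long seq)
{
    if (seq == ev->seq)
        ev->acked = 1;
}

/* Called when the retransmission timer fires for this event.
 * Decides whether to stop, resend, or give up on the client. */
enum next_action on_timeout(struct pending_event *ev, int max_resends)
{
    if (ev->acked)
        return ACTION_DONE;
    if (ev->resends >= max_resends)
        return ACTION_GIVE_UP;
    ev->resends++;
    return ACTION_RESEND;
}
```

Even this small amount of state has to be kept per client and per
outstanding event, which is the burden we would rather place on a
Server than on an embedded agent.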
----------------------------------------------------------------------
> D. Allows lightweight implementation
> (Okay, I'm going to drill down a little here.) I believe this
> means the agent gets to decide the contents of messages (i.e.
> the basic agent is not required to do client-specified
> filtering)
Exactly. Again, couldn't agree with you more. As I mentioned in
Boulder, we explored the rather traditional avenue of "event
filter specifications" and quickly came away with the same opinion.
Allow the agent (the producer, or Publisher) to dictate the set of
events delivered to interested clients (ie, Subscribers); then,
let the client perform filtering as needed. While this approach
does not fully optimize network bandwidth, we feel the overall
Sense specification would be quite a bit simpler without introducing
event filter specifications. Such specifications often result
in the development of full-blown languages in their own right, and
this was seen as a potentially large barrier to entry for developers.
Letting the agent (Publisher) side dictate the granularity and number
of event streams (Editions) was considered an adequate compromise
between resource efficiency and simplicity. I realize, though, that
some may argue the opposite (and quite vigorously, too ;-).
So far, though, in our many client implementations, there doesn't
seem to be a lot of "content fat" in taking this approach. That is,
a client rarely disregards events from subscribed Editions.
----------------------------------------------------------------------
> E. Scalability
> I believe the way to do this is by introducing proxy servers,
> so the basic protocol needs to be able to deal with the issues
> involved in ensuring the liveness of copied data.
This is precisely why the "Server" is so clearly defined as being
separate from the "Publisher" in the Sense spec, to be able to
handle "proxy-like" capabilities on behalf of potentially large
numbers of Publishers.
As I described in Boulder, Sense is defined to provide a degree
of "disconnected service", whereby a client can obtain information
about the target object without necessitating involvement with the
object's "agent" (particularly important for embedded agents).
(This same concept is embodied in the DMI specification, too.)
Publications (and Editions, for that matter) possess a number of
standard properties that allow a client to easily determine whether
a Publisher is currently "live", as well as the last time the
Publication or Edition was updated by its corresponding Publisher.
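
As an illustration only, a client-side liveness test based on a "last
updated" property might look like the ANSI C sketch below. The struct
fields and the idea of a per-Publication silence threshold are
assumptions made for the example; the actual Sense property names are
not shown here.

```c
#include <time.h>

/* Hypothetical subset of the standard properties a client reads
 * from the Server for a Publication (or Edition). */
struct pub_status {
    time_t last_update;      /* when the Publisher last updated it */
    long max_silence_secs;   /* assumed threshold chosen by client */
};

/* Returns 1 if the Publisher has updated the entry recently enough
 * to be treated as live, 0 otherwise. No contact with the
 * Publisher itself is needed to make this decision. */
int publisher_seems_live(const struct pub_status *s, time_t now)
{
    return difftime(now, s->last_update) <= (double)s->max_silence_secs;
}
```

The point is that the answer comes entirely from data held by the
Server, so the embedded agent is never disturbed by such queries.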
----------------------------------------------------------------------
> (Drilling down again; revisiting message filtering.) I wonder
> if a notion of message schema variants might provide an
> alternative to client-defined message filtering. The idea being
> that an agent (proxy, in particular) might define and deliver
> "subset" variants in addition to the full message; by selecting
> a variant, a client could reduce the number of irrelevant
> notifications it received.
The "message schema variants" are represented as Editions of different
classes, where the classes are defined in a hierarchical name space.
For example, a Publisher may provide an Edition that emits "device
alerts", where the device alerts all have the same format (syntax);
this Edition might have the class named:
DeviceAlerts
The Publisher may also provide a number of Editions that represent
content subsets of device alerts; for example, to differentiate
"service alerts" from, say, "consumables alerts". In this case,
the Edition classes might be named:
DeviceAlerts.Consumables
DeviceAlerts.FieldService
Our goal was to be able to provide for this kind of granularity
for various event types and environments.
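
One way a client might use the hierarchical names is to treat a class
as covering itself and everything beneath it. The ANSI C sketch below
shows that idea; the prefix-matching rule and the function name
edition_class_matches() are assumptions for illustration, not
something the Sense spec is claimed to mandate.

```c
#include <string.h>

/* Returns 1 if edition_class is the wanted class itself or a dotted
 * descendant of it (eg, "DeviceAlerts.Consumables" falls under
 * "DeviceAlerts"); returns 0 otherwise. A name that merely shares
 * leading characters (eg, "DeviceAlertsExtra") does not match. */
int edition_class_matches(const char *wanted, const char *edition_class)
{
    size_t n = strlen(wanted);

    if (strncmp(wanted, edition_class, n) != 0)
        return 0;
    return edition_class[n] == '\0' || edition_class[n] == '.';
}
```

With this rule, a client interested in all device alerts would ask for
"DeviceAlerts", while a supplies-ordering tool would ask only for
"DeviceAlerts.Consumables".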
----------------------------------------------------------------------
> F. Client discovery of services
> The client needs to be able to query an agent to determine
> available event notification services, message schemas, and so
> forth.
Yes, some sort of simple directory service is needed to query
the supported services. This is a primary role of the Server;
the Publisher, on the other hand, has no need to support such
a directory service, thereby simplifying the design and
implementation of the Publisher.
----------------------------------------------------------------------
> The client also needs to be able to locate agents, and locating
> proxy agents should be a natural part of the service.
A motivating reason for the Server was to provide a large repository
of information, making it easier for the client (Subscriber) to locate
interesting objects (Publications and Editions), and to be able to
support a large number of Publishers.
However, as I mentioned in Boulder, we expressly ignored requirements
for locating Sense Servers. It was our hope that the scalability of
the Server would allow a customer to maintain a very small number of
Servers--quite possibly only a single Server--thereby making the
Server discovery process trivial. (For example, the addresses of
known Servers would be stored in well-known files or system registry
entries, based on the underlying server host's architecture, etc.)
----------------------------------------------------------------------
> G. Leverages existing technology
> (I'm on thin ice here.) To me, this all looks very much like a
> read-only dynamic directory service, and I wonder if we can't
> model it as a variant on LDAP. (Rationale similar to that for
> basing IPP on HTTP.)
>
> There is (was?) an internet draft, LDAP: Extensions for Dynamic
> Directory Services (draft-ietf-asid-ldapv3ext-02), authored by
> Yaacovi, Settle, and Genovese of Microsoft, and Wahl of Critical
> Angle, that might be relevant, although I have no idea of its
> current status or support. Also, the roles it specifies for
> clients and servers don't seem to map directly to what is needed
> here.
Ah yes, why not LDAP? And why not Windows/NT registry? And let's
not forget Novell's NDS. And so on, and so on...
We have always been in favor of leveraging existing technology...with
one small requirement, though:
The leveraged technology must be pervasively available, ideally
being part and parcel (ie, integrated AND bundled) with the
underlying operating system such that *everyone* has the same
basic (useful) set of services.
That will be the day when Unixdom, Redmond and Provo come together
and offer the world consistent, interoperable directory services.
And that day will be at least ten years BEFORE those same players
provide consistent, interoperable object services...such as seen
in the everlasting CORBA vs. DCOM battle.
And while not as formidable, we really question how quickly
standard implementations of Agent-X and SNMPv3 will become readily
available, such that printer and software vendors can safely rely
on their existence within the typical customer environment.
We don't get good vibes on this being likely in the near future,
either. We certainly wish this wasn't the case, believe me.
But I digress... This topic is worthy of a separate thread on
the Sense mailing list: "How long to wait for unified services?"
----------------------------------------------------------------------
> I. Security, passage through firewalls
> No bright ideas here, but these issues need answers. An ugly
> but practical approach might be some combination of polling and
> encapsulation of information over HTTP for remote notification.
Oh, now you had to go and use that "F" word: firewall.
Sense has been designed from day #1 as an intranet (ie, enterprise)
facility. While I'm not saying it can't be used across firewalls--
after all, all one has to do is poke another well-defined hole in
the firewall, if necessary--we just didn't address it.
Given the trials and tribulations in IPP, perhaps now everyone
can see why we chose to avoid this particular topic. We can
talk about it, but it's not going to be easy to solve in a generic
way. And we are *loath* to hold up the rest of Sense just because
it doesn't stand up to "firewall standards" (whatever those are).
----------------------------------------------------------------------
> Overall, I'd say these requirements look a lot like the existing
> SENSE requirements. Perhaps the biggest difference is that I would
> like to align with services that have evolved since the design of
> SENSE -- for both engineering and "mind share" reasons.
I beg everyone to seriously consider my previous comments regarding
the need for pervasive deployment of required underlying technology.
If we can demonstrate a real, workable turnkey implementation that
satisfies all the specified requirements, then we should run with it,
even if it doesn't require all the many pieces of "standard" service
technology that are expected to be aligned and available by the year
2014.
Later, when all the "real" and "long-term" component solutions
become commonly available, we can alter Sense to take advantage
of them. However, if we feel we must wait for such technology
to come into existence, then we might as well put Sense on the
shelf for several years to come.
Our reference implementation has shown that we don't have to wait.
It's here now, and it appears to work. And, it can be shown to be
easily portable across all the major server environments...with no
special software components required (since all servers now have IP
stacks, thank heavens).
----------------------------------------------------------------------
> Key issues that I think need to be revisited include:
>
> A. Delivery mechanism
> The datagram-based delivery is lightweight (a big advantage),
> but it has size limits and security and firewall problems. Is
> there a better choice?
Are you still of the mind that "size matters"? ;-)
Seriously, we put a big stake in the ground by saying we would
accept the limitation that any single Sense message (in particular,
an event message) was limited to the size of a single UDP packet (8KB).
In all the many events generated within our reference Publishers
(including an RFC 1759 Publisher), the UDP size limit has rarely
been found to be a problem, even with a Server providing access
to as many as 800 printers.
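
In code, honoring that limit amounts to a check before each send, as
in the ANSI C sketch below. The constant assumes "8KB" means 8192
bytes, and the split into header and payload lengths is a hypothetical
message layout chosen only for the example.

```c
#include <stddef.h>

/* Assumption: the 8KB limit is taken to mean 8192 bytes. */
#define SENSE_MAX_DATAGRAM 8192

/* Returns 1 if a message with the given header and payload lengths
 * fits in a single datagram, 0 if it must be rejected (or trimmed)
 * by the sender. Written to avoid overflow when adding the sizes. */
int fits_in_one_datagram(size_t header_len, size_t payload_len)
{
    if (header_len > SENSE_MAX_DATAGRAM)
        return 0;
    return payload_len <= SENSE_MAX_DATAGRAM - header_len;
}
```

A Publisher would make this check when building an event message, so
an oversized event is caught before it ever reaches the socket layer.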
> B. Message format
> SENSE tries to be content neutral; I'd nail this down more. And
> of course schemas need to be defined for specific classes of
> event producers.
Absolutely. We have some proposals on the table (which we have
since implemented), but the number of definitions is boundless,
limited only by the time we need to agree on their definitions.
(Yes, much like a MIB, but hopefully less daunting.)
Hope all of this spawns some serious questions and discussions.
...jay
PS: Security? Did someone mention security?? Yet another topic...
----------------------------------------------------------------------
-- JK Martin | Email: jkm at underscore.com --
-- Underscore, Inc. | Voice: (603) 889-7000 --
-- 41C Sagamore Park Road | Fax: (603) 889-2699 --
-- Hudson, NH 03051-4915 | Web: http://www.underscore.com --
----------------------------------------------------------------------