Category: TWC Meeting Notes

Over Thanksgiving Break, John let us know about an opportunity that Evan and Deborah had to work on the Wine Agent project, migrating it to the iPad. After asking what it would involve (primarily UI work using the iOS SDK), I signed up. It looks like a really neat chance for a few reasons. First, I have mostly been doing individual web development projects with SPARQL/HTML and PHP or Python, but they haven’t really been part of an actual project or group, which made me feel like I wasn’t contributing much. The ISWC demo was a lot of fun because it was actually needed and also built on that previous work, which made me feel a lot better about it. This is similar, except that it is part of an actual project group, whereas the ISWC demo was unassociated with a group and I wrote it independently. Also, it is software application development, which I have less experience with and had hoped to work on during my time at the TWC to be more well-rounded, especially because I want to do software development after I graduate. It is also in Objective-C, so it will broaden my foundation in languages too. Finally, it means that I won’t feel like I’m totally wasting my time while on vacation, since I’ll be working on this during it!

Of course, working on iOS code to get the Mobile Wine Agent onto an iPad means that I should probably be using a Mac to do it. Since I’ve never used a Mac before, even this will be an interesting experience, as I can fiddle with both the MacBook and the iPad while I work with them; that was another reason why it sounded interesting (like the best trial demo ever…plus coding project). Most of the last few days were spent fiddling with preferences and getting them set up, which is always fun. My first impression of the iPad is that it’s an iPhone with a few more features, but too big to be more portable than a laptop. Still, it is definitely useful if you want to do something very quickly; I’ve used it to check my mail in Safari on days when I don’t get to my computer until much later. I think it should have come with more apps along those lines, like news or weather, or ones that make the best use of the medium size, like eBooks (I grabbed the free iBooks app just in case). The MacBook is really nice. The only issue is that the different style of trackpad/mouse/keyboard keeps messing me up (using that funny symbol button instead of ctrl = annoying). The main ‘wow’ was the battery life: the estimate jumped around a bunch, probably because it’s new, but it looks like 5-6 hours from full, whereas my Thinkpad started at 3 and a bit when it was new (now it’s down to a little less than 2…).

Today was spent getting the existing codebase to work on the MacBook and to sync correctly to the iPad since that is, again, rather important. It didn’t take long to get Xcode installed and, with a bit of fiddling, connected to the SVN repository, but I was stalled for a while on two errors when I tried to build and run the code on a simulator. One was just certificate and profile issues, which I fixed with Evan’s help. The other was an SDK issue: the build kept telling me that it had no base SDK, even though I had gone into the project settings and selected the just-installed SDK. As it turns out, Xcode displayed the settings as if they were fine, but the change had not reached the actual configuration file, so the build continued to fail. With all of that finally working, I moved on to getting it to run on the iPad instead of the simulator, which again caused some certificate and registration issues that Evan helped me with.

Now that I know the current iPhone code builds and runs correctly in the simulator and on the iPad, the next step is to get acquainted with Objective-C and how iOS apps work, then do the actual work of getting the code to work well on the iPad. Just from looking at some files briefly while I was trying to figure out why it wasn’t building, and then seeing how the current code behaves on the iPad, I can see some things that I definitely need to figure out. First, I need to see what kind of code is needed to detect which device the app is running on. In some cases the two devices can share the same code, but in others I need different behaviors depending on the device. Once I know this, I can look at what to actually implement or change. Screen size is the most obvious issue, but more important is what to use the extra space for. One idea is to change the property addition page so that, instead of moving to a new screen with a scrollbar, the screen is divided so that the user can see the current properties on the left and add new ones on the right. Currently, this is done in two screens because of size, so this would probably involve a lot of changes to the behavior of the first one and some sort of new split screen. An easier way might just be to have the second page display the current properties so that you can see them, but otherwise behave the same. The choice will probably come down to aesthetics, and I’ll iron out how to do that later. Another really obvious and even more important thing to fix is that the iPhone display does not rotate, but the iPad version must be able to handle any orientation, which will affect size, what is displayed, and how it is displayed. I need to look up how this is done and see how to incorporate it into the code as well. This will likely play havoc with the layouts, so I should probably figure it out second, after seeing how the device can be differentiated, with the size third and the aesthetic stuff last, and with the many other things I didn’t think of in the little glimpses I’ve had so far thrown in there. For now, I’m just trying to get a general idea of the things to be done.

So I’ll probably spend the time until around Christmas/2011 figuring out Objective-C and the orientation/differentiation stuff and getting a rough plan, then spend the 3 weeks after that until we come back working on all of that…but this plan will probably explode anyhow, because that’s how plans work. We will see.

Today’s (well, yesterday now) All-Hands Meeting was mostly an overview of the Site Hackathon II, which was summed up as ‘didn’t finish enough’, followed by some general announcements about break, incoming technology, and how no one responds to e-mails or wiki links on time, apparently. Last was Dominic giving a practice talk about Open Data.

Today’s meeting was interesting in that it ended half an hour earlier than usual, but fit in twice as many topics as usual, although that may have been because the first few were summaries and reminders.

The meeting started with a thanks to everyone who came to the Drupal Site Hackathon, with some discussion of the work that was done, the quality (and lack thereof) of the pizza, and some final tasks that need to be looked at, such as the RDFa parsing needed to make an improved editor GUI. After that, Tim came up once again to remind everyone to keep an eye on the lab’s efforts to coordinate the SPARQL endpoint information and management, with some explanation of how that was working out so far. The last of the organizational talk was about the planned visual presentations on the monitors that will be placed around Winslow so that visitors can see our work. It didn’t really seem to differ much from the initial meeting, seeing as there are still debates over whether to use PowerPoint slideshows, web slideshows, interactive demos, or Concerto.

Next, Dominic talked a bit about the recent trip to D.C., and Professor Hendler talked about the rising interest in semantic applications and how large numbers of people stopped him to ask where and how they could get them. The TWC professors also emphasized that, in the future, every trip ought to be accompanied by a blog post outlining what the travelers did/saw/heard/thought was interesting/etc.

Finally, someone from dotCIO came to talk about their own approach to the decentralized data of the campus. They have a huge number of different Content Management Systems, so their solution is to aggregate all of the data and then deaggregate it in various ways for individual purposes. The example shown was the way they aggregated all of the building names/data to allow easy tagging to each building, and how this same method applies to people, where everyone’s RCSIDs are annoying to look up and this system allows for much easier tagging. He noted that although this is not as good as a semantic solution, they were looking to add things like RDFa.

After that, the meeting ended early so that we could welcome two new people working with the lab to manage finance and funding issues (I think).

I didn’t actually make a post about last Friday’s AHM, as it was almost entirely organizational stuff, recaps, and a presentation on some health ontology application work. Much of this week’s was similar, with meetings and deadlines being discussed, as well as a congratulatory speech by Professor Hendler on the collaboration and hard work that he is seeing, which led up to the success at ISWC 2010.

Patrick also gave a presentation with a general overview of some of the work he has been doing in Australia with CSIRO, a central scientific organization there. Specifically, he was working with a group looking at applying provenance to a hydrology project, where they use a sensor network to gather and process data to be used in models and forecasts for water usage predictions and decisions. In order to justify and defend the results of such a project, they were looking into provenance to help provide information on what leads to trust in the results. He mentioned that a big discussion on this was between external and internal provenance, where the CSIRO people could work on provenance within their own scope, but more external provenance that is closer to the actual sensors would be under the jurisdiction and responsibility of other groups. Other issues that he mentioned were in the problem space, provenance collection points, modeling methods, and infrastructure design and development. He also noted that, by being there to talk about and really present the concepts, they helped to provide motivation, validation, and confidence in a way that bolstered the Australian team’s efforts.

Today’s AHM was split between talking about the ISWC 2010 Demo and a presentation about having user annotations in provenance.

Jie started out by giving an overview again of the Semantic Dog Food goals and ISWC 2010 dataset. Alvaro showed his mobile browser, and I presented my filtered browser. It was pretty short, since most of the functions work the same way across areas, so I just showed a bit of what the various displays look like and how the navigation works. No one seemed to have any questions or suggestions, so I didn’t really go into detail about any of the implementation.

Before the meeting, I worked on some more aesthetic changes, as well as enabling local links to use my browser to go to things like people or papers that show up in the retrieved data as objects. I also noticed an issue where any data that had colons in it, such as the literal owl:sameAs, was cut off because of the way I used strtok. I was unable to finish fixing this before the meeting, and I think Professor McGuinness actually noticed, because she asked about the made predicate, one of the areas affected. I did finish fixing this afterwards, however, as well as a similar issue with single quotes, where the single quotes were breaking both the query and the search URL. Professor Hendler noted that I should move the demo to a more appropriate server than my CS account (Evan noted that it’s especially important since the CS servers have been breaking all semester), so I’ll have to look into that.
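The two bugs above are easy to reproduce in any language. Here is a minimal sketch in Python (not the demo's own code), assuming a hypothetical "name: value" line format and made-up function names: split on the first colon only, so colons inside the value survive, and escape single quotes separately for the SPARQL literal and for the URL.

```python
from urllib.parse import quote


def split_field(line):
    """Split 'name: value' on the FIRST colon only.

    Tokenizing on every colon (the strtok-style mistake) would cut
    a value like 'owl:sameAs' down to 'owl'.
    """
    name, _, value = line.partition(":")
    return name.strip(), value.strip()


def sparql_literal(text):
    """Wrap text as a single-quoted SPARQL literal.

    Only backslashes and single quotes are escaped here; a fuller
    version would also handle newlines and other control characters.
    """
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def search_url(base, term):
    """Build a search URL with the term percent-encoded (' becomes %27).

    The '?q=' parameter name is an assumption for illustration.
    """
    return base + "?q=" + quote(term, safe="")
```

The point is that the query string and the URL need different escaping, so the same raw value should be kept around and escaped once per destination rather than patched in one place.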

Some screenshots from the demo:


So, during today’s All-Hands Meeting, we first went over some of the planning from last time, which included the ISWC 2010 visualization planning. As it turns out, the problem with all the date/time information was fixed, so it is now available for my demo idea, and I checked with Evan to make sure that the endpoint I was using is updated/not going to randomly vanish. The presentation during the meeting was an evaluation of various inference rule methods tested on Smart Grid technology. It went through some different features, such as forward/backward chaining, built-in rules, and subclass relationships, and their effects on performance and usability.

I also tried to think about my plans for the browse/search demo, which I have definitely decided to attempt to make. I don’t know how far I will get with it, but at the very least, I want to finish some basic browsing capabilities, where the user can click through links to easily access the information, as well as a basic time/schedule display with some filtering capabilities such as by specific date ranges, specific papers/workshops, or something similar. I worked some more on the skeleton/framework code, which is still changing a lot as my plans change on how to implement it, but I think it is almost to where I can start the actual functionality (parsing/displaying).

It’s still really vague, but it’ll become clearer as I work on it and see what I can actually feasibly finish implementing.

Yesterday’s All-Hands meeting consisted mostly of two presentations and a few brainstorming sessions. The first brainstorm was about ideas for making the lab areas more interesting so visitors can get a good first impression. The ideas were mostly based around a Concerto-like system, where TVs could be placed around the building. Unlike Concerto, most of the suggestions involved sound and video instead of still slideshows, although an overview slideshow was also discussed.

After that discussion, Jie talked about the Semantic Web Dog Food project, which is an attempt to gather semantic metadata for semantic data conferences, workshops, etc. The name reflects its motivation, since it is based on the saying that people working on something should “eat their own dog food” and use what they make. The project can be found at [link]. Jie talked about how the hope is to use it to aid in looking up people/papers/events before, during, and after conferences, and said that it is populated with basic data for people/papers/programs/etc. using a variety of methods, including spreadsheets, dumps, PDFs, LaTeX, and online scraping. For an upcoming conference, he was also looking to get ideas for, and have people finish, usable visualizations, browsing tools, and/or searches, which were discussed at the end of the meeting. Unfortunately, I had to leave in the middle of that end discussion to go to class for a midterm, so I don’t have notes on the ideas and what plans were made.

Afterwards were the two presentations, starting with Tim’s talk on trust in aggregated government data. He noted that the process of data aggregation can produce untrustworthy data for many reasons, such as the opaque “cloud of conversion”, where the information is taken from the raw sources (reliable) and translated by the aggregator (less reliable). A key factor in resolving this is provenance information: information about the information, its sources, and so on. This is needed for distinguishing between sources, minimizing the number of manual modifications needed during conversion, and tracing/attributing data. For capturing the conversion provenance, he listed the steps as following redirects, retrieving and unzipping the raw data, manually tweaking it, converting it, and populating the dataset. Finally, he suggested a three-part system for dataset organization to achieve this trust using provenance data. Each URI would be reached through …/source/…/dataset/…/version/…, where the source would be the organization’s DNS name, the dataset would be its ID, and the version would be some broad/release date/modification time designation. He further elaborated on other features that would be used in conjunction with this system, such as using interpretation knowledge to go from raw data to parameters instead of naïve CSV conversion, renaming properties, typing a property’s domain, promoting a value to a resource, and more.
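To make the three-part naming scheme concrete, here is a small Python sketch that assembles a URI of the …/source/…/dataset/…/version/… shape. The base address, the example identifiers, and the function name are all made up for illustration; the talk did not give an implementation, and the elided parts of the real pattern are not filled in here.

```python
from urllib.parse import quote

# Hypothetical base address; the real prefix was not given in the talk.
BASE = "http://example.org"


def dataset_version_uri(source, dataset, version, base=BASE):
    """Join the three identifiers into one path-style URI.

    source  -- identifier of the publishing organization (e.g. its DNS name)
    dataset -- the dataset's ID
    version -- a release date, modification time, or similar label
    Each part is percent-encoded so that stray slashes or spaces
    cannot change the path structure.
    """
    parts = [
        base.rstrip("/"),
        "source", quote(source, safe=""),
        "dataset", quote(dataset, safe=""),
        "version", quote(version, safe=""),
    ]
    return "/".join(parts)
```

For example, dataset_version_uri("agency.example", "1234", "2010-10-01") gives http://example.org/source/agency.example/dataset/1234/version/2010-10-01, so two conversions of the same dataset differ only in the last segment.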

The second presentation was by Johanna, who presented her work over the summer on automatic generation of implicit links in datasets. Basically, there is a problem where all the projects are very local, without links to other ones, and there are many ambiguous literals, such as “New York”; her work was to automatically turn these ambiguous literals into correct links. The program depends on a mapping set and word banks, using methods such as direct matching by regex or keyword as well as approximate matching using edit distance and prefix filtering.
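As a rough illustration of the direct-then-approximate matching idea, here is a Python sketch with a tiny made-up mapping set. It is not Johanna's program: the URIs, the distance threshold, and the function names are assumptions, and prefix filtering (used to avoid comparing against every candidate) is left out for brevity.

```python
import re

# Hypothetical mapping set from normalized labels to resource URIs.
MAPPING = {
    "new york": "http://example.org/place/New_York",
    "new jersey": "http://example.org/place/New_Jersey",
}


def normalize(text):
    """Lowercase, trim, and collapse whitespace runs with a regex."""
    return re.sub(r"\s+", " ", text.strip().lower())


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def link_literal(literal, mapping=MAPPING, max_distance=2):
    """Return a URI for the literal, or None if nothing is close enough."""
    key = normalize(literal)
    if key in mapping:  # direct match
        return mapping[key]
    # Approximate match: the closest label, if within the threshold.
    best = min(mapping, key=lambda label: edit_distance(key, label), default=None)
    if best is not None and edit_distance(key, best) <= max_distance:
        return mapping[best]
    return None
```

With this mapping, both "New  York" (extra space) and the typo "New Yrok" resolve to the New York resource, while an unrelated literal like "Boston" returns None instead of being forced onto a wrong link.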

Today’s All-Hands-Meeting was split into two main parts. First were some introductions and a lot of organizational stuff, figuring out who was doing what, with some outlines of current issues with the site migration, such as the site’s search speed (a SPARQL cache in PHP was suggested/delegated). This took a lot of the time, so there was only one presentation, which talked about the sameAs construct and its implications/usages in very large linked data networks.

The main point seemed to be that using sameAs between pay-level-domains (domain names) rather than between the individual terms allowed for interesting clustering. His examples were based around some category clusterings created by queries using sameAs between domains in a very large dataset from the Billion Triples Challenge. There were some questions about the assumptions made when using sameAs, such as its transitive and symmetric properties.
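One way to picture domain-level clustering is to treat each sameAs link as an undirected edge between the two domains involved and take connected components, which is exactly what assuming symmetry and transitivity amounts to. The Python sketch below does this with union-find. It is only an illustration, not necessarily how the speaker built his clusters, and it approximates the pay-level domain as the last two labels of the hostname (a real version would need a public suffix list); the domains in the usage note are made up.

```python
from urllib.parse import urlparse


def pay_level_domain(uri):
    """Naive pay-level domain: the last two labels of the hostname."""
    host = urlparse(uri).hostname or ""
    return ".".join(host.split(".")[-2:])


def cluster_domains(same_as_pairs):
    """Group domains connected by sameAs links (symmetric + transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for subject, obj in same_as_pairs:
        a = find(pay_level_domain(subject))
        b = find(pay_level_domain(obj))
        if a != b:
            parent[b] = a  # merge the two clusters

    clusters = {}
    for domain in parent:
        clusters.setdefault(find(domain), set()).add(domain)
    return sorted(sorted(members) for members in clusters.values())
```

Given links from data.alpha.example to beta.example and from beta.example to gamma.example, all three domains land in one cluster even though alpha and gamma are never linked directly, which is where the questions about assuming transitivity come in.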

A point he made at the beginning was how the idea of rdfs:seeAlso evolved into the owl:sameAs construct.

Today I attended my first All-Hands Meeting for the TWC, where several people talked about what they have been working on.  The following is mostly just a summary from my notes during the meeting, since much of the software and languages used are ones I am not familiar with.

The first person summarized what happened at the International Web Reasoning and Rule Systems Conference. They talked a bit about the SPARQL presentation by Axel Polleres that I had previously looked at when trying to learn SPARQL, which was interesting. They summarized a lot of points, where the topics included work to go between Datalog and description logic, three-valued logic (yes/no/unknown) and contradictions, work on a visualization language and translation to concepts, access control on ontologies, and a paper on removing redundancies in RDF, which apparently won Best Paper and included a proof of NP-completeness.

After that, the next person talked about work on optimizing real-time RDF data streams, which was basically trying to integrate social/sensor data with the semantic web in real time, such as having a semantically enabled Twitter widget. He listed some RDF update formats, such as SPARQL, Talis changesets, delta ontologies, and more, none of which I had really heard of, except for SPARQL. He also spent some time going over how the data would be transported: using HTTP POST, or possibly UDP, which has higher throughput and lower lag but more constraints on it. He noted that HTTP/TCP uses persistent connections, while UDP uses small atomic messages, which allows some data to be lost but is much faster. He ended with some examples and talk about how such a widget might be implemented and what the limits would be, in terms of throughput, if any.
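The "small atomic messages" idea can be seen in a few lines of Python using plain UDP sockets on the local machine. This is only a sketch of the transport, not the speaker's system; the payload is a made-up triple, and the function names are hypothetical.

```python
import socket


def open_receiver(host="127.0.0.1"):
    """Bind a UDP socket on a free port chosen by the OS."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))   # port 0 lets the OS pick a free port
    sock.settimeout(2.0)   # UDP gives no delivery guarantee, so never wait forever
    return sock


def send_update(payload, address):
    """Send one update as a single datagram: no connection, no acknowledgement."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def demo():
    """Send one made-up triple over loopback and return what arrives."""
    receiver = open_receiver()
    try:
        update = b'<http://example.org/s> <http://example.org/p> "42" .'
        send_update(update, receiver.getsockname())
        data, _sender = receiver.recvfrom(4096)
        return data
    finally:
        receiver.close()
```

Each update has to fit in one datagram and may simply never arrive, which is the trade-off mentioned in the talk: less overhead per message than an HTTP POST over TCP, but the application has to tolerate loss.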

Finally, the last speaker talked about linked LISP in ontologically oriented programming. As a project, he had apparently written a LISP interpreter in Java, with semantic web integration. The architecture was an ANTLR-based grammar, with types from Java and the Jena memory model. There was some discussion of how the various functors were implemented, and how the semantic web was integrated using the Jena memory model and inference engine.