Friday 9 November 2012

Not quite there: fun with Google Image recognition and search

There's something playful about the way researching one topic often leads to new insights on another.  While attempting to track down copyright information for a test file, I decided to try out some of Google's newer image recognition and search capabilities, such as the browser-based Google Image and the mobile app, Google Goggles.

The underlying idea is pretty wonderful – instead of trying to think up just the right search terms, simply snap a picture or upload an existing image, and the magic of Google will find images “like” it. This could be a great tool for designers who may be looking for similar but slightly different images to fit into a specific design project. Could the promise of algorithmic image recognition reduce the need to apply keywords and descriptions to our images?

Thursday 1 November 2012

metadata - use it or lose it

Over this fall I have recruited representatives from our participating units who are working with me to simplify our digital asset management system.  Not surprisingly, we have no shortage of ideas for new functionality or updates!  However, our challenge has been to develop recommendations that we could implement ourselves, without requiring expensive customizations.  We focused on simplifying the descriptive metadata that must be applied to assets when importing.
 
As discussed in my previous post, there are principles we can apply when considering which metadata fields to hide or remove. Streamlining also means reworking the remaining fields so they are easier to use for all groups. Here are a few more general principles derived from experience and information standards:
  1. Use it or lose it
  2. Mutual exclusivity
  3. Use the language of your users
  4. Use terms important to organizational context

Tuesday 16 October 2012

Principles for simplifying DAM file descriptions

Our digital asset management project was launched with a goal of making it easier to find, share and reuse images and graphic assets across the City of Toronto.   Participating staff can now access images from the official photographers and the City Planning photographer from a central location.  After 1 year of use, we are in a good position to look for opportunities to improve for existing and future users.

When the system went live in 2011, it incorporated the requirements of four different business units. During the development process, metadata fields were added in anticipation of future functionality or to create an exhaustive list of descriptive information. The list grew as each group identified what seemed to be unique details or processes. The result was a fairly large set of metadata (50 fields!). Before launch, we reviewed the overall set and developed a custom profile for each group. Here's an example, from our Photo Video unit:



You can imagine that this list might be somewhat daunting for new users!   Keep in mind that some fields are left blank for certain files, and some are automatically captured.

"Make it simpler"

If all our staff feedback could be reduced into a single idea, this would be it.  Even a seasoned cataloger might be challenged by our set of metadata at first glance.  Given our distributed model, in which individual units and staff are responsible for uploading and describing their own content, overly complex metadata represents a serious impediment to consistent adoption for graphic design project files.  

This review project will reduce the barriers to sharing by streamlining description, upload and searching for files.  Some fields might be "nice to have", but are they essential to share photos?   Are they necessary to describe most graphic design project files in Information Production and City Planning?  We will consider the benefits of each metadata field against the staff resources required to complete them.

Principles to consider when evaluating metadata fields to hide or remove:

Fields with no current functionality. Example: Records Classification Code. This was included anticipating lifecycle management, but was not budgeted for in our first phase.  The field can be hidden.  It may be reused for administrators only at a later date, if we choose to implement.

Fields that are unused.  Example: master/source file location.  Intended to capture the location of high-resolution master video files, this field has seen limited use with only a small set of videos uploaded.  Managing the master-derivative file relationship is a larger issue that requires an approach across all asset types.  The field can be hidden.

Fields that duplicate information available in other systems.  Example: Creator contact info.   Static contact information stored in DAL becomes less useful over time as staff move to new positions or leave the City.  City staff contact information is also available from our email system.   This field can be hidden.

Fields that are not needed for a user group's primary content. Example: Date(s) of Creation. This Archives field provides space for a variety of date formats when the actual date the photograph was shot is unknown. A sample value might be ca. 1910. This is important descriptive information for Archives, but it is not needed by groups whose content is born-digital and who already have another field with a validated timestamp format. This field can be hidden for non-Archives groups. It may be made visible to other groups when needed for search, as part of a future customization.

Fields that are only required for specific asset types. Example: Related Images.  This field is only used when Consent Forms or Client Request Forms are uploaded and linked to their related images.  It can be made visible only when those asset types are selected.
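The principles above could be sketched as a simple review checklist in code. This is only an illustration, not how DAL is configured: the `Field` structure, the `review` function, the 5% usage threshold and the usage figures in the example are all assumptions made for the sketch. Only the field names and the reasons for hiding them come from the review described here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Field:
    """Hypothetical description of one metadata field under review."""
    name: str
    has_functionality: bool = True            # does the system act on this field today?
    usage_rate: float = 1.0                   # assumed share of assets with a value
    duplicated_elsewhere: bool = False        # is the same information kept in another system?
    needed_by_groups: Optional[Set[str]] = None   # None means every group needs it
    asset_types: Optional[Set[str]] = None        # None means it applies to every asset type


def review(field: Field, group: str, low_usage: float = 0.05) -> str:
    """Apply the five principles in order and return a recommendation."""
    if not field.has_functionality:
        return "hide: no current functionality"
    if field.usage_rate < low_usage:
        return "hide: unused"
    if field.duplicated_elsewhere:
        return "hide: duplicated in another system"
    if field.needed_by_groups is not None and group not in field.needed_by_groups:
        return "hide for this group"
    if field.asset_types is not None:
        return "show only for: " + ", ".join(sorted(field.asset_types))
    return "keep"


# Example fields from this post; the usage rate is an invented placeholder.
fields = [
    Field("Records Classification Code", has_functionality=False),
    Field("Master/source file location", usage_rate=0.01),
    Field("Creator contact info", duplicated_elsewhere=True),
    Field("Date(s) of Creation", needed_by_groups={"Archives"}),
    Field("Related Images", asset_types={"Consent Form", "Client Request Form"}),
]

for f in fields:
    print(f.name, "->", review(f, group="Photo Video"))
```

Running the same list with `group="Archives"` would keep Date(s) of Creation visible, which is the per-group behaviour described above.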

These principles should help guide our work to simplify our system and lay the groundwork for expanded adoption.  In my next post, I will examine options to improve the remaining core fields and controlled lists.

Wednesday 22 August 2012

Reuse - the beauty of embedded metadata

My Digital Asset Library (DAL) plans for this year included improving our capacity to manage working files and making better use of embedded metadata. These two goals came together as several staff in Information Production and City Planning participated in a working files pilot. Their feedback made clear we need to make it easier to upload, describe and reuse files. Both Information Production and City Planning, Graphics & Visualization have similar business processes, creating new design projects using Adobe Illustrator and InDesign, linked to the raw graphic & image files, and often reusing existing content.

Pilot participants noted that applying detailed metadata to the many distinct files that make up a single design product could be very time consuming. This process is quite different from uploading a batch of event photos, which all have the same metadata. We decided to re-examine how we can simplify the process of uploading and tagging files.

As a first step, we focused on making it easier to reuse files and metadata. If an event photo or graphic has already been described in DAL, we wanted that information to stay with the file once downloaded.  It became clear we needed to develop an embedded metadata standard. Using Adobe's XMP technology, embedded metadata allows descriptive information (such as creator or keywords) to be written directly in the file and be read and written by different compatible applications. The first implementation of the standard will automatically apply metadata to files on download from DAL. This will save time since the files will not need to be described again if they are modified, reused and/or uploaded.

 Over the summer I've worked with representatives from each business unit to develop our embedded metadata standard. I'm pretty happy we were able to identify a core set that everyone could agree on. Taking the time to debate the implications of choosing particular standards and mappings was very useful to gain consensus. Throughout the process I emphasized that this standard was our first version and not set in stone. It will change as we gain more experience and our requirements evolve. I think acknowledging the provisional nature of this particular set of metadata properties was very helpful.

We were able to build on existing mappings used to embed metadata in Archives' photos prior to upload into DAL. Looking outside the City, the Smithsonian Institution provided excellent perspective and advice with their Basic Guidelines for Minimal Descriptive Embedded Metadata in Digital Images.

We've identified 16 fields that can be embedded. Not all fields are used by all participating groups. Some are interoperable with the Dublin Core Metadata Standard, the City of Toronto Descriptive Standard and IPTC (the default fields used in Adobe CS products), while some are DAL custom fields.
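To give a sense of what embedded metadata looks like, here is a minimal sketch of an XMP packet carrying two Dublin Core properties, creator and keywords. It is not our standard and not how DAL writes files: the `build_xmp` function and the sample values are made up for illustration, and the sketch only produces the XML text. Actually writing the packet into an image or design file needs an XMP-aware application.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP packets that carry Dublin Core properties.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return the namespace-qualified tag name ElementTree expects."""
    return "{%s}%s" % (NS[prefix], tag)


def build_xmp(creators, keywords):
    """Build a small XMP packet as a string (hypothetical helper)."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:creator is an ordered list of names (rdf:Seq).
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    for name in creators:
        ET.SubElement(seq, q("rdf", "li")).text = name

    # dc:subject holds keywords as an unordered set (rdf:Bag).
    bag = ET.SubElement(ET.SubElement(desc, q("dc", "subject")), q("rdf", "Bag"))
    for keyword in keywords:
        ET.SubElement(bag, q("rdf", "li")).text = keyword

    return ET.tostring(xmpmeta, encoding="unicode")


# Sample values are placeholders, not real DAL records.
packet = build_xmp(["Staff Photographer"], ["parks", "events"])
print(packet)
```

Because the description travels inside the file in a shared vocabulary, any compatible application that opens the file can read the same creator and keywords without anyone retyping them.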


We are now working to secure funding for professional services to implement our standard in DAL. Working closely with the business representatives, I have a much better understanding of their needs. Using this standard for embedding metadata in downloaded files is really a first step towards an improved model for managing working files. We will continue to plan further improvements to our metadata creation and uploading model to drive adoption, file reuse and effective search.

Thursday 12 July 2012

Edmonton's move to Google Apps

Well, it's finally happened.   A large Canadian municipality, Edmonton, has decided to make the jump to Google for email and office productivity.  In the past, City of Toronto staff raised concerns about how this would benefit service delivery and what security risks this might bring.
According to Edmonton's press release, benefits include the ease of accessing these tools from any desktop or mobile device, real-time collaboration capabilities, expansion to staff currently without email, and reducing anticipated future costs, so "staff will spend less time maintaining systems, allowing them to dedicate more resources to the important mission of serving the citizens of Edmonton".  They also note that "Google Apps has completed a rigorous security certification and accreditation with the U.S. federal government."

City Clerk's Office staff will naturally have more questions about privacy, access and security. This Edmonton Journal article provides some additional context on security and dismisses the issue of storing data in the US. According to Edmonton's CIO Chris Moore, Edmonton will continue to own all their data. He adds that concerns about the Patriot Act are a red herring since Edmonton's email traffic already travels through the US. The University of Alberta (with 125 000 users) has also begun shifting to Google Apps, and their vice-provost of IT notes that "during 18 months of study, he determined the company's information security and privacy is far tighter than anything he could afford to provide."

For a more critical analysis, I recommend this 2-part blog post and interview with Edmonton's CIO here. The interview includes discussion of legal implications, Edmonton's completed Privacy Impact Assessments, the City Clerk's Office and Freedom of Information and Privacy Office. In the interview and part 2 post, the author questions why the City of Edmonton's Privacy Impact Assessments have not been submitted to the Office of the Information and Privacy Commissioner of Alberta and the Office of the Privacy Commissioner of Canada for advice and recommendations. Submission is not mandatory, but the University of Alberta voluntarily submitted its assessment for its own project. Recommendations from the Commissioner included notifying all staff and students that their email is now under US legal jurisdiction and that the university cannot guarantee it will be protected from disclosure, and advising them to consider carefully what information they choose to send.

I have to admit (if it wasn't obvious already) that a compelling case has been made, though I have some reservations. The collaboration capabilities and mobile access of Google Apps are impressive. From an information management perspective, the integration with Google's Vault feature, providing retention, archiving and legal hold functionality, is very appealing. The project is clearly being driven by Edmonton's IT department, but is being carried out in consultation with their City Clerk's Office and Legal partners. I would love to hear their thoughts on the project, especially regarding the decision not to solicit further advice on their Privacy Impact Assessment. As the City of Toronto sets out on its own more traditional implementation of EDRMS, we should monitor Edmonton's progress closely and evaluate how it could inform our own approach.

Thursday 5 July 2012

Liberation Technology - #OLITA 2012 Conference #opengov

A few weeks ago I attended the Ontario Library Information Technology Association's 2012 conference here in Toronto, hosted at the Toronto Reference Library. The focus of the conference was the use of technology to deliver social goods and social justice. Sessions included open data in a library context, inclusive and accessible design research, and the need for copyright to enable creativity and reuse. While I don't work in a public library (or in IT, for that matter), the topics jumped out at me with connections to projects I've been working on. The full conference program is here.



Open Data is dead! Long live Open Data! - MJ Suhonos

Suhonos advocated for open data as a natural evolution of the world wide web, with network effects that enable people to do new, unexpected things with existing resources. Unreleased data is really unrealized potential that can only be unlocked by sharing, combining, and using it in new ways. These arguments were familiar to me, and this really reinforced how successful the City of Toronto's Open Data program has been. Open Data in public libraries appears to be at an earlier stage. One useful open data evaluation tool I hadn't heard of before was Tim Berners-Lee's 5-star system for ranking Open Data programs. It provides a concise evaluation framework for data providers and advocates to make open data programs as useful as possible.

Citizen Archivists: Transcribing History for Future Generations - Rebecka Sheffield

Since my position has recently been transferred to the Archives, I've become more interested in how we can better capture contextual information about images, graphics and design projects, to support archival descriptions. This talk reinforced the value of distributed, public annotation and sharing while acknowledging that we are still challenged by how best to incorporate this information into standard structures such as the Rules for Archival Description.

Outside In - Jutta Treviranus

Treviranus is the director of the Inclusive Design Research Centre (IDRC) at OCAD University. She delivered a wide ranging talk on the value of inclusive, accessible design and the work of IDRC. Much like the principles of privacy and access by design, inclusive design should include the needs of all users at the design phase rather than attempting to bolt on or modify near the end of design work. I was intrigued by the proposed open model of portable user preferences (font size, screen reading) allowing any system to seamlessly recognize and adapt. "By Design" is an excellent principle, but I can't help wondering if it means a (necessary?) brake on the culture of creativity, experimentation and fast prototyping of open data and hacker culture encouraged by other conference speakers. Reusable models and tools can hopefully streamline the incorporation of these inclusive design principles.

A Critical Account of Copyright in Canada - Carys Craig

Improving reuse of content is a key goal of the project I am working on, so I was very interested in getting a better understanding of copyright in Canada. Professor Craig delivered a fascinating and wide ranging talk which questioned some underlying assumptions and narratives about copyright. She was particularly critical of the rising narrative of copyright as total ownership, expanding and strengthening protections. In fact, the origins of copyright are in encouraging creative expression and intellectual dissemination. Expressing concerns about the direction of the Canadian copyright bill, C-11, she encouraged us to advocate for a more balanced approach, calling the digital locks provisions "electric fences" against use. As I work towards an improved model for access and reuse inside our DAM, I'm particularly trying to avoid an overly restrictive, technical approach to rights management. Copyright should reward creativity by respecting ownership, but it must also protect the public good by enabling fair use and creative innovation. Craig has recently published Copyright, Communication and Culture: Towards a Relational Theory of Copyright Law which I look forward to reading.

Big thanks to the organizers for putting together a broad ranging and affordable conference. I especially appreciated the mix of theoretical and practical content, examining challenges and opportunities inside and beyond Ontario libraries.

Thursday 19 April 2012

City of Toronto's Digital Asset Library - importing trends #DAM

Everyone loves measuring performance with statistics, right? At this point, we don't have a standard set of reports that can be automatically output from our Digital Asset Library (DAL). I have to manually run a series of searches and compile information onto a spreadsheet, then output a chart. Not the most elegant of solutions, but it does provide a bird's eye view of our progress so far:





Observations:

Photo Video unit is by far the most active uploader into DAL.  This is not surprising because they were the originating unit of the DAL project, and the project requirements were tailored most closely to their needs.   They have less complex file types (typically TIFFs & JPEGs) than some of our other participating groups.  They also produce a large volume of photographs of City events which can be quickly tagged and uploaded as batches.  To their credit, they have been dedicated uploaders, typically importing hundreds of photos in batches each week.

Both Graphics & Visualization, City Planning and Information Production, City Clerk's Office have a steady but moderate growth in their files uploaded. Specific staff have been using DAL for sharing completed projects (as PDFs). An individual design project naturally takes more time to complete than a single event photo, and therefore there are fewer overall files than Photo Video. However, these numbers also make clear that we must continue expanding our pilot for managing more complex working Adobe Illustrator & InDesign files with placed art, leading to full adoption throughout these units. These files take more time to upload and tag, as each individual working file has unique metadata. We are currently planning an embedded XMP metadata project which should reduce workload when reusing working files in new projects.

Archives are using DAL to upload batches of JPEG access copies, which have embedded XMP metadata.  The fall-off in imports in early 2012 reflects some challenges in updating application components that allow DAL to read embedded metadata on import.  There is a learning curve for sustaining and managing new systems and we are certainly experiencing this.  We have a plan to resolve this issue in the near future, so batch imports can resume later in 2012.

This snapshot of DAL uploads reinforces the need for better reporting, which is essential for identifying trends and areas for improvement.  We are planning to upgrade our reporting capabilities in 2013.
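Until better reporting arrives, the manual tally behind a chart like this could be scripted. The sketch below assumes the search results can be exported as a CSV with `unit` and `import_date` columns. That export format, the column names, the `monthly_imports` function and the sample rows are all hypothetical, not something DAL provides today.

```python
import csv
import io
from collections import Counter


def monthly_imports(csv_text):
    """Count imported files per (unit, month) from CSV text.

    Assumes one row per imported file, with a 'unit' column and an
    'import_date' column in YYYY-MM-DD format (both hypothetical).
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["import_date"][:7]  # keep only YYYY-MM
        counts[(row["unit"], month)] += 1
    return counts


# Invented sample rows, only to show the shape of the input.
sample = """unit,import_date
Photo Video,2012-03-02
Photo Video,2012-03-15
Archives,2012-03-20
Photo Video,2012-04-01
"""

for (unit, month), total in sorted(monthly_imports(sample).items()):
    print(unit, month, total)
```

The per-unit, per-month counts can then be pasted into the spreadsheet, or charted directly, without rerunning each search by hand.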

Monday 26 March 2012

digital asset management at the City of Toronto - priorities for 2012 #DAM

2012 is a year of ongoing improvements and next phase planning for our Digital Asset Library.  We are focusing on improving our standards and processes around working files, embedded metadata and copyright.  We are also gathering requirements to build a business case for our next phase - incorporating feedback, functional needs as well as necessary application and environment upgrades to support growing numbers of users and content.

What has really struck me over the last year I've been working on this project is the great interest from City staff  in better image management and reuse.    Staff really do recognize the value of sharing assets between units.  In 2012, we'll be laying the groundwork for improved functionality, new partnerships and better sharing of resources.   Here's an overview of DAL priorities, as I see them, over the next year.  Your feedback is welcome.

Wednesday 11 January 2012

DAM project feedback from #Toronto Urban Design staff #EIM #OpenGov #ToCouncil



Last year, I had the opportunity to introduce our Digital Asset Library project to the City Planning Urban Design team at a staff meeting in Etobicoke.   Our first phase is a pilot with a smaller unit that is part of Urban Design, the Graphics and Visualization unit, but we wanted to provide an update to the larger team.  I sat in on the full meeting, and it was fascinating to get a big picture view of design and planning projects underway.  I was intrigued by work to connect South Etobicoke and Mimico with the waterfront, and integrate a growing Humber College into the surrounding community and historic campus. 


There were some great questions during my presentation and many staff took the time to speak to me afterwards with their thoughts.  A few key issues that came out of our conversations:


Creative Commons License
This work by Jonathan Studiman is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.