GuildWiki


Purpose

This bot's purpose is to remind users who have uploaded new images without licensing information to add that information to their images.

Functioning

Unlike other bot tasks, this one is ongoing. Every Monday and Thursday the bot scans Special:Newimages and compiles a list of the files uploaded since its last run. For any file lacking licensing information, the bot leaves the following message on the uploader's talk page:

Image licensing reminder

Hello, <username>. You are receiving this automated message because you recently uploaded files to GuildWiki. The following images appear to be missing licensing information: <list images>. Please see GuildWiki:Image license guide for more information. Thank you for your contributions. This is an automated message from <botname><date>
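The message template above can be sketched as a small formatting helper. This is only an illustration, not the bot's actual code: the class name `ReminderMessage` and the methods `linkList` and `build` are hypothetical, and the wording is copied from the template shown above. The automated signature is left out because the bot appends it separately.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the bot): formats the reminder text for one uploader. */
final class ReminderMessage {

    /** Joins image titles as wiki links separated by ", ". */
    static String linkList(List<String> images) {
        StringBuilder sb = new StringBuilder();
        for (String image : images) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            // The leading colon links to the file page instead of embedding the image.
            sb.append("[[:").append(image).append("]]");
        }
        return sb.toString();
    }

    /** Builds the talk page section for the given user and their unlicensed images. */
    static String build(String userName, List<String> images) {
        return "\n\n==Image licensing reminder==\n"
                + "Hello, " + userName + ". You are receiving this automated message because you "
                + "recently uploaded files to GuildWiki. "
                + "The following images appear to be missing licensing information: "
                + linkList(images)
                + ". Please see [[GuildWiki:Image license guide]] for more information. "
                + "Thank you for your contributions.";
    }

    public static void main(String[] args) {
        // Example data only; these names are made up.
        System.out.println(build("ExampleUser", Arrays.asList("Image:A.jpg", "Image:B.jpg")));
    }
}
```

Building the link list with a separator check avoids having to trim a trailing comma afterwards.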

Language and versions

This bot is written and run in Java and uses the Java Wiki Bot Framework (JWBF). The version described here, 1.9, is the final release, pending notes offered on the talk page. In its edit summaries the bot refers to the program as IR followed by the version number, for example IR 1.9. Note that the code listing on this page is labeled version 2.4.

Issues

  • Perhaps this should be its own bot?
    • Not necessary. It won't be constantly running.
  • How often should this run?
    • Once a day is plenty. Once a week would be a reasonable interval, but you don't want to lose users who may only log in once in a while. The goal is to get users to notice fairly quickly that there is an issue. The bot, however, can comfortably handle any time interval and go through several pages of Special:Newimages. The bot will probably be run twice a week at most.
  • If a user uploaded multiple files, do they get a message for each?
    • One message listing all images that need licenses found in that run.
  • Can we also check that certain licenses include all fields?
  • Can we also check that images aren't named GW###.jpg or similar?
    • Those two tasks may be run separately. They will not be included in the scope for this bot.
  • If the user's talk page is a redirect to another page, then it will still leave a message under the redirect.
    • Redirects on user pages are mainly used to direct an IP to a registered account's page, or when a user has had their name changed. Since IPs cannot upload files anyway, the first case is not an issue. As for the second, a user whose account has been renamed will probably not be uploading files from the old account.
  • If the last run date on this page is tampered with, the bot will run with the wrong start date.
    • The start date will be read from the local machine and only copied here for reference.
  • Bot edits marked as minor do not trigger the new messages notice.
    • This bot's edits will not be marked as minor. Since the edits don't appear in recent changes by default anyway, this should not be an issue.
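The "one message listing all images" answer above amounts to grouping uploads by uploader before posting. A minimal sketch of that grouping follows, assuming Java 8 or later and two parallel lists in which index i of `images` was uploaded by index i of `users`. The class and method names are hypothetical, and the sketch assumes the lists already contain only files that need a license.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not the bot's code): collects each uploader's images into one list. */
final class UploadGrouping {

    /**
     * Groups images by uploader. users.get(i) is taken to be the uploader of images.get(i).
     * A LinkedHashMap keeps uploaders in the order they were first seen.
     */
    static Map<String, List<String>> groupByUploader(List<String> users, List<String> images) {
        Map<String, List<String>> byUser = new LinkedHashMap<>();
        for (int i = 0; i < images.size(); i++) {
            // Create the uploader's list on first sight, then append the image to it.
            byUser.computeIfAbsent(users.get(i), k -> new ArrayList<>()).add(images.get(i));
        }
        return byUser;
    }

    public static void main(String[] args) {
        // Example data only; these names are made up.
        Map<String, List<String>> grouped = groupByUploader(
                Arrays.asList("Alice", "Bob", "Alice"),
                Arrays.asList("Image:A.jpg", "Image:B.jpg", "Image:C.jpg"));
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            System.out.println(entry.getKey() + " uploaded " + entry.getValue());
        }
    }
}
```

Because the same map can be passed through every page of results, a user who appears on several pages still ends up with a single entry and therefore a single message per run.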

Code

Note: including actual code here throws off the bot. It MUST be shown as an inclusion.

/**
 * The following is NOT released under the CC 2.0 by-nc-sa license. It may only be used with the 
 * express consent of the copyright holder(s). JavaDoc and/or .java files available on request.
 * Program referred to as IR for Image Reminder.
 * @author Heather Arbiter (c) (harbiter@gmail.com)
 * @version 2.4
 */
import net.sourceforge.jwbf.bots.MediaWikiBot;
import net.sourceforge.jwbf.contentRep.mw.SimpleArticle;
import org.w3c.dom.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.util.*;
import wikitest.ShortListException;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.BufferedWriter;
import java.io.File;
import javax.swing.JOptionPane;

public class Main {
    //Used to parse the DOM retrieved from Special:Newimages in various methods
    static final int GAL_DIV_ONE_ROOT = 11; //number of divs until the first gallery node
    static final int GAL_DIV_USER = 4; //index user as a child of gallerytext node
    static final int GAL_DIV_NAME = 1; //index name as a child of gallerytext node
    static final int GALTEXT_INTERVAL = 4; //number of divs between galtext nodes
    static final int MAX_FILES_PER_PAGE = 48; //number of files per pageview
    static final int A_LASTRUN_INDEX = 10; //number of anchor tags to cur run time
    static final int A_VIEW_PREV_INDEX = 9; //index of the anchor tag holding the "view previous" link
    //version for edit summary
    static final double VERSION = 2.4;
    
    public static void main(String[] args) throws Exception {              
        MediaWikiBot bot;
        try{
            bot = getNewBot(args[0], args[1]);
        }catch(Exception e){
            print("Error: Unable to log in. Check info.");
            return;
        }
        String oldDate;
        try{           
            oldDate = getLastRunDate();
        }catch(java.io.FileNotFoundException e){
            print("Error: " + e.toString());
            return;
        }        
        Document doc = getDocument("http://guildwars.wikia.com/index.php?title=Special:Newimages&from=" + oldDate);
        HashMap remindList = new HashMap();
        //declared outside the loop so the loop condition and the end-date lookup can both read it
        boolean shortList = false;
        while(hasNextPage(doc) && !shortList){
            try{
                doc = getDocument("http://guildwars.wikia.com" + getNextPage(doc));
            }catch(org.xml.sax.SAXParseException e){ //report the parse failure and stop
                print("Error: " + e.getMessage() +
                        " on the following page: http://guildwars.wikia.com" + getNextPage(doc) +
                        ". Program terminating.");
                return;
            }

            //shortList test: a page that does not hold exactly MAX_FILES_PER_PAGE files ends the loop
            int filesOnPage = getFilesOnPage(doc);
            shortList = filesOnPage != MAX_FILES_PER_PAGE;

            NodeList el = doc.getElementsByTagName("div");
            updateRemindList(bot, getUserList(el, filesOnPage), getImageList(el, filesOnPage), remindList);
        }


        // leaveTestSummery(bot, remindList);    //test only
        postReminders(bot, remindList);       //real run
        // getEndDate(doc, shortList);
        postEndDate(bot, getEndDate(doc, shortList));
        print("Done.");
    }
    
    /**
     * Instantiate the bot and log it in.
     * @param botName username of the bot
     * @param pass the password for the bot
     * @return MediaWikiBot the logged in bot
     */
    public static MediaWikiBot getNewBot(String botName, String pass) throws Exception{
        MediaWikiBot b = new MediaWikiBot("http://guildwars.wikia.com");
        b.login(botName, pass);
        return b;
    }
    
    /**
     * returns the last edit date from GuildWiki:Bot_tasks/License_reminder
     * @param b botname
     * @return the last edit date. 
     * @deprecated v1.6 to getLastRunDate() to read from file
     */
    public static String getLastRunDate(MediaWikiBot b) throws Exception{
        SimpleArticle sa = new SimpleArticle(b.readContent("GuildWiki:Bot_tasks/License_reminder"));
        String s = sa.getText();
        int x = s.indexOf("'''Last Run:'''");   
        x += 15; 
        s = s.substring(x);
        s = s.trim(); //trim() returns a new string, so the result must be assigned
        print("last run: " + s);
        return s;
    }
    
    /**
     * creates a new DOM document (to help parse special pages)
     * @param pageName the name of the page you want to parse
     * @return new DOM document based on given pagename
     * @throws java.lang.Exception
     */
    public static Document getDocument(String pageName) throws Exception{
        DocumentBuilderFactory domBF = DocumentBuilderFactory.newInstance();
        DocumentBuilder domB = domBF.newDocumentBuilder();
        return domB.parse(pageName);     
    }
    
    /**
     * leaves a message on the given user's talk page. 
     * @param b the current bot
     * @param userName userName of page to leave message on (EXCLUDING namespace)
     * @throws java.lang.Exception
     * @deprecated v1.2 to leaveMessage(bot, userName, msg)
     */
    public static void leaveMessage(MediaWikiBot b, String userName) throws Exception{
        SimpleArticle sa = new SimpleArticle(b.readContent("User talk:" + userName));
        sa.addText("\n==Test edit==\n" +
                "You are receiving this edit as a test. " +
                "This is a test edit from ~~~~");
        sa.setEditSummary("test edit using JWBF");
        sa.setMinorEdit(true);
        b.writeContent(sa);       
    }
    /**
     * leaves a message on the given user's talk page. 
     * @param b the current bot
     * @param userName userName of page to leave message on (EXCLUDING namespace)
     * @param msg the message to leave
     * @throws java.lang.Exception
     */
    public static void leaveMessage(MediaWikiBot b, String userName, String msg) throws Exception{
        SimpleArticle sa = new SimpleArticle(b.readContent("User talk:" + userName));
        sa.addText(msg +
                " <small>This is an automated message from ~~~~</small>");
        sa.setEditSummary("[[GuildWiki:Bot tasks/License reminder]] - JWBF - IR " + VERSION);
        sa.setMinorEdit(false); //setting true will not trigger new messages notice
        b.writeContent(sa);       
    }
    
    /**
     * constructs a list of all usernames.
     * @param n the nodelist of all divs
     * @param files the number of files on the page
     * @throws ShortListException
     * @return an ArrayList consisting of all usernames who recently uploaded files
     */
    public static ArrayList getUserList(NodeList n, int files) throws wikitest.ShortListException{        
        ArrayList list = new ArrayList();
        for(int i=0; i<GALTEXT_INTERVAL*files-4; i = i+GALTEXT_INTERVAL){
            try{
               list.add(getUserName(n.item(GAL_DIV_ONE_ROOT + i)));               
            }catch(Exception e){ 
                throw new ShortListException("User List truncated.");
            }
        }
        return list;
    }
    
    /**
     * constructs a list of all images
     * @param n the nodelist of all divs
     * @param files the number of files on the given page
     * @throws ShortListException
     * @return an ArrayList consisting of all images that were recently uploaded
     */
    public static ArrayList getImageList(NodeList n, int files) throws wikitest.ShortListException{
        ArrayList list = new ArrayList();
        for(int i=0; i<GALTEXT_INTERVAL*files-4; i = i+GALTEXT_INTERVAL){
            try{
               list.add(getImageName(n.item(GAL_DIV_ONE_ROOT + i)));
            }catch(Exception e){ //this should not be thrown b/c getUserList() should throw exception first
                throw new ShortListException("Image list truncated.");
            }
        }
        return list;
    }
    
     /**
     * gets the username of a single entry on the new images list
     * @param n current gallerytext node in the list of divs
     * @return the name of the user in that gallerytext box. EXCLUDES namespace
     */
    public static String getUserName(Node n){
        String fullname = n.getChildNodes().item(GAL_DIV_USER).getAttributes().getNamedItem("title").toString();
        return  fullname.substring(12, fullname.length()-1); //trim +namespace before returning
        
    }
    
    /**
     * gets the image name of a single entry on the new images list
     * @param n current gallerytext node in the list of divs
     * @return the name of the image in that gallerytext box. INCLUDES namespace
     */
    public static String getImageName(Node n){
       String imagename = n.getChildNodes().item(GAL_DIV_NAME).getAttributes().getNamedItem("title").toString();
       return  imagename.substring(7, imagename.length()-1); //trim before returning
    }
    
    /**
     * prints a summary of users and the files they uploaded, as constructed by this bot, to the console
     * @param users users
     * @param images images they uploaded
     */
    public static void printResultList(ArrayList users, ArrayList images){
        for(int i=0; i<users.size(); i++){
            System.out.println(users.get(i) + " uploaded " + images.get(i));
        }
    }
    
    /**
     * Determines if the given image is lacking licensing information
     * @param bot the bot
     * @param imageName the name of the image to check (INCLUDING namespace)
     * @return true if lacking info; false if info is okay
     */
    public static boolean isNeedingLicense(MediaWikiBot bot, String imageName) throws Exception{
        SimpleArticle sa = new SimpleArticle(bot.readContent(imageName));
        String info = sa.getText();
        info = info.toLowerCase();
        if(     info.contains("{{screenshot") ||
                info.contains("{{fansite kit image") ||
                info.contains("{{user-created image") ||
                info.contains("{{public domain image") ||
                info.contains("{{fair-use image") ||
                info.contains("{{gfdl image") ||
                info.contains("{{mediawiki screenshot")){
            return false;
        }
        postUnattributedNotice(bot, sa);
        return true;
    }
        
    /**
     * Builds a HashMap of String User --> ArrayList Images of images that need licensing information. SHOULD NOT BUILD A NEW HASH MAP. SHOULD CONTINUE AN EXISTING MAP
     * @param bot the bot
     * @param users the list of all users who uploaded files
     * @param images the list of all files recently uploaded
     * @return the built HashMap
     * @throws java.lang.Exception
     * @deprecated v1.2 to updateRemindList()
     */
    public static HashMap getReminderList(MediaWikiBot bot, ArrayList users, ArrayList images) throws Exception{
        HashMap hm = new HashMap();
        for(int i=0; i<images.size(); i++){
            String curImage = (String) images.get(i);
            if(isNeedingLicense(bot, curImage)){ //if the image needs licensing
                if(hm.containsKey(users.get(i))){ //if the user is already in the map
                    ArrayList imgvals = (ArrayList) hm.get(users.get(i));  //get the list of images for that user
                    imgvals.add(curImage); //add this image
                }else{ //start a new user and list
                    ArrayList a = new ArrayList();
                    a.add(curImage);
                    hm.put(users.get(i), a);
                }   
            }
        }
        return hm;
    }
    
    /**
     * Leaves a test summary on RogueJedi's page. Indicates which users will receive which messages.
     * @param bot the bot
     * @param map mapping of users to image lists to navigate
     * @throws java.lang.Exception
     */
    public static void leaveTestSummery(MediaWikiBot bot, HashMap map) throws Exception{
        for(Iterator iter = map.entrySet().iterator(); iter.hasNext(); ){
            Map.Entry entry = (Map.Entry) iter.next();
            String curKey = (String) entry.getKey(); //next user
            String imgs = ""; //reset per user
            ArrayList imgList = (ArrayList) entry.getValue(); //users image list
            for(Iterator i = imgList.iterator(); i.hasNext(); ){ //add each entry to a string
                imgs += "[[:" + i.next() + "]], ";
            }
            imgs = imgs.substring(0, imgs.length()-2);
            leaveMessage(bot, "GW-RogueJedi", "\n==Image licensing reminder to " + curKey + "==\n" +
                    "Hello, " + curKey + ". You are receiving this automated message because you " +
                    "recently uploaded files to GuildWiki. " +
                    "The following images appear to be missing licensing information: " + imgs +
                    ". Please see [[GuildWiki:Image license guide]] for more information. " +
                    "Thank you for your contributions. "                    );
        }       
    }    
    
    /**
     * Checks to see if there are more files which were uploaded after the current set being viewed
     * @param doc the document object of the current page
     * @return if there are more images uploaded after the current 48
     */
        public static boolean hasNextPage(Document doc){
            //When there is no further page, "View (previous 48)" is plain text rather than an <a> tag,
            //so the anchor at A_VIEW_PREV_INDEX reads "next 48" instead of "previous 48".
            String x = doc.getElementsByTagName("a").item(A_VIEW_PREV_INDEX).getTextContent();
            return !x.contains("next 48");
        }
        
        /**
         * Gets the URL of the page with the next set of images. (looks at the "view previous" link because the list is in reverse order)
         * @param doc document object of the current page
         * @return the url of "view previous" which is a link to the next 48
         */
        public static String getNextPage(Document doc){
            String s = doc.getElementsByTagName("a").item(A_VIEW_PREV_INDEX).getAttributes().getNamedItem("href").toString();
            s = s.substring(6, s.length()-1);
            print("next url: " + s);
            return s;
        }
        
        /**
         * Returns the current date/time indicated by the "show new images as of" link on the current page.
         * If the last page checked doesn't have exactly 48 images then the "next 48" isn't a link. 
         * See http://guildwars.wikia.com/index.php?title=Special:Newimages&from=20080614014600 for a page that doesn't throw the short list exception 
         * @param doc DOM for the current page. Should be the most recent 48
         * @param off set to true if the div should be off by one because there are neither next nor prev
         * @return a string in the form YYYYMMDDHHMMSS indicating the time of the last view on the bot's run
         */        
        public static String getEndDate(Document doc, boolean off){
            int err = A_VIEW_PREV_INDEX+1;
            if(off) err--;
            String s = doc.getElementsByTagName("a").item(err).getAttributes().getNamedItem("href").toString();
            //System.out.println(s);
            s = s.substring(46, s.length()-1); //46 is the offset to the date
            //print("new last run date:" +s);
            return s;
        }
        
        /**
         * Replaces the previous last run date on GuildWiki:Bot tasks/License reminder with the given date
         * Also writes to file
         * @param bot the bot
         * @param date the date to replace it with in form YYYYMMDDHHMMSS
         * @throws java.lang.Exception
         */
        public static void postEndDate(MediaWikiBot bot, String date) throws Exception{
            //WRITE TO ARTICLE            
             SimpleArticle sa = new SimpleArticle(bot.readContent("GuildWiki:Bot_tasks/License_reminder"));
             String s = sa.getText();
             int x = s.indexOf("'''Last Run:'''");   
             x += 15; 
             s = s.substring(0, x);
             s += date;   
             sa.setEditSummary("[[GuildWiki:Bot tasks/License reminder]] - IR " + VERSION + " Updating last run date/time.");
             sa.setText(s);
             bot.writeContent(sa);         
             //WRITE TO FILE
             BufferedWriter writer = new BufferedWriter(new FileWriter("lastrun.txt"));
             writer.write(date);            
             writer.close();
        }
        
        /**
         * Updates the given HashMapping of String users --> ArrayList images with the current group of users and images
         * @param bot the bot
         * @param users uploaders of a group of files
         * @param images the file names they uploaded. index of file is same as index of uploader
         * @param hm the provided hash map to update
         * @throws java.lang.Exception
         */
        public static void updateRemindList(MediaWikiBot bot, ArrayList users, ArrayList images, HashMap hm) throws Exception{
            for(int i=0; i<images.size(); i++){
                String curImage = (String) images.get(i);
                if(isNeedingLicense(bot, curImage)){ //if the image needs licensing
                    if(hm.containsKey(users.get(i))){ //if the user is already in the map
                        ArrayList imgvals = (ArrayList) hm.get(users.get(i));  //get the list of images for that user
                        imgvals.add(curImage); //add this image
                    }else{ //start a new user and list
                        ArrayList a = new ArrayList();
                        a.add(curImage);
                        hm.put(users.get(i), a);
                    }   
                }
            }
        }
        
        /**
         * Posts the reminder messages on the individual user talk pages 
         * @param bot the media wikibot
         * @param map Mapping of users and images
         * @throws java.lang.Exception
         */
        public static void postReminders(MediaWikiBot bot, HashMap map ) throws Exception{
            for(Iterator iter = map.entrySet().iterator(); iter.hasNext(); ){ //more users
                Map.Entry entry = (Map.Entry) iter.next();
                String curKey = (String) entry.getKey(); //next user
                String imgs = ""; //reset per user
                ArrayList imgList = (ArrayList) entry.getValue(); //users image list
                for(Iterator i = imgList.iterator(); i.hasNext(); ){ //add each entry to a string
                    imgs += "[[:" + i.next() + "]], ";
                }
                imgs = imgs.substring(0, imgs.length()-2); //trim last comma
                leaveMessage(bot, curKey, "\n\n==Image licensing reminder==\n" +
                        "Hello, " + curKey + ". You are receiving this automated message because you " +
                        "recently uploaded files to GuildWiki. " +
                        "The following images appear to be missing licensing information: " + imgs +
                        ". Please see [[GuildWiki:Image license guide]] for more information. " +
                        "Thank you for your contributions. "                    );
            }
        }           

    /**
     * returns the last edit date from file in the form YYYYMMDDHHMMSS
     * @return the last edit date. 
     */
    public static String getLastRunDate() throws Exception{
        File f = new File("lastrun.txt");
        BufferedReader reader = new BufferedReader(new FileReader(f));
        String s = reader.readLine();
        print("last run: " + s);
        reader.close(); //close the reader to release the file handle
        return s;
    }

     /**
      * Prints output to a popup dialog and to the console; rewrite this method to change where output goes
      * @param output the text to display
      */
    public static void print(String output){
        //showMessageDialog is static, so no JOptionPane instance is needed
        JOptionPane.showMessageDialog(null, output);
        System.out.println(output);
    }
    
    
    /**
     * Posts {{unattributed image|~~~~~}} (five tildes, which MediaWiki expands to a timestamp only) on a given article. Called directly from isNeedingLicense to reduce lag
     * @param bot the bot
     * @param sa the article defining the text of an image; image should already have been determined to need tag
     * @throws java.lang.Exception
     */
    public static void postUnattributedNotice(MediaWikiBot bot, SimpleArticle sa) throws Exception{
        sa.addText("{{unattributed image|~~~~~}}");
        sa.setEditSummary("[[GuildWiki:Bot tasks/License reminder]] - JWBF - IR " + VERSION);
        sa.setMinorEdit(true);
        bot.writeContent(sa);     
    }
    
    
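    /**
     * Reads the number of files shown on the current page from the text of the first b element
     * @param doc the document object of the current page
     * @return the number of files on the page
     */
    public static int getFilesOnPage(Document doc){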
        NodeList bs = doc.getElementsByTagName("b");
        String bk = ((CharacterData) bs.item(0).getFirstChild()).getData();
        return Integer.parseInt(bk);
    }
}
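
The listing above pulls the date out of a link with fixed character offsets (6 and 46), which only works while the link has exactly the expected shape. As a hedged alternative, the sketch below looks the `from` parameter up by name. The class and method names are hypothetical, and the href shape is inferred from the offsets used in the listing rather than confirmed against the live page.

```java
/** Hypothetical sketch (not the bot's code): extracts the "from" timestamp from a link. */
final class FromParameter {

    /**
     * Returns the run of digits following "from=" in the given href, or null if there is none.
     * Only a "from=" that directly follows '?' or '&' counts as the query parameter.
     */
    static String extractFrom(String href) {
        String marker = "from=";
        int start = href.indexOf(marker);
        while (start >= 0) {
            boolean atBoundary = start > 0
                    && (href.charAt(start - 1) == '?' || href.charAt(start - 1) == '&');
            if (atBoundary) {
                int valueStart = start + marker.length();
                int end = valueStart;
                // Stop at the first non-digit, such as '&' or a closing quote.
                while (end < href.length() && Character.isDigit(href.charAt(end))) {
                    end++;
                }
                return end > valueStart ? href.substring(valueStart, end) : null;
            }
            start = href.indexOf(marker, start + 1);
        }
        return null;
    }

    public static void main(String[] args) {
        // Example href only; its shape is an assumption based on the offsets in the listing.
        System.out.println(extractFrom("/index.php?title=Special:Newimages&from=20080614014600"));
    }
}
```

Because the scan stops at the first non-digit, the same method copes with a value that is still wrapped in `href="..."`, so no trailing quote needs to be trimmed by hand.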

Sample output

Testing and sample outputs are usually shown on User talk:RogueJedi.

Last run

When the bot runs, it will update the timestamp at the bottom of this section as a courtesy. Notice that the bot actually reads the start date from a local file, so changing this date will not directly affect the bot.
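
The timestamp uses the form YYYYMMDDHHMMSS. A minimal sketch of reading and writing that form with `java.time` (Java 8 or later) follows; the class and method names are hypothetical, and the bot itself simply stores the value as a string.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch (not the bot's code): converts the last run timestamp to and from a date. */
final class LastRunTimestamp {

    /** Pattern for the 14-digit YYYYMMDDHHMMSS form. */
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Parses a 14-digit timestamp, ignoring surrounding whitespace. */
    static LocalDateTime parse(String stamp) {
        return LocalDateTime.parse(stamp.trim(), FORMAT);
    }

    /** Formats a date and time back into the 14-digit form. */
    static String format(LocalDateTime time) {
        return time.format(FORMAT);
    }

    public static void main(String[] args) {
        LocalDateTime lastRun = parse("20080707191013");
        System.out.println(lastRun);
        System.out.println(format(lastRun));
    }
}
```

Parsing the value before use would reject a malformed date early, since `LocalDateTime.parse` throws a `DateTimeParseException` on input that does not match the pattern.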


Last Run:20080707191013
