
Mobile Robotics Quiz 3

Terms

Why do Machine Learning on Robots?
1. We're lazy. 2. It could be faster and better than we could ever imagine. 3. Programming a robot to do something specific by hand can be hard.
Possible Problems with Machine Learning
1. Learning is hard. 2. May not be faster than hard-coding.
3 Types of Learning
1. Control learning 2. Model learning 3. Robot model learning
Control Learning
Learning a function that maps sensor readings to actions.
Model Learning
Learning maps or representations of things in the world.
Robot Model Learning
Learning sensor models, kinematics models, and error models
Supervised Learning
Learning to generalize from examples
Unsupervised Learning
Forming mappings that optimize some criterion, without labeled examples.
Classification Learning
Learning to assign class labels given some inputs.
Classification Algorithms
1. Nearest Neighbor 2. Decision Trees 3. K-means clustering
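A minimal sketch of the first of these, nearest neighbor, assuming numpy and made-up 2-D sensor features:

    import numpy as np

    def nearest_neighbor(query, examples, labels):
        # Label the query with the class of the closest training example
        # (Euclidean distance).
        dists = np.linalg.norm(examples - query, axis=1)
        return labels[np.argmin(dists)]

    # Hypothetical readings labeled "wall" vs. "open"
    examples = np.array([[0.2, 0.1], [0.9, 0.8]])
    labels = ["wall", "open"]
    print(nearest_neighbor(np.array([0.25, 0.2]), examples, labels))  # -> wall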
Regression Learning
Learning continuous functions or mappings, rather than discrete class labels.
Regression Learning Algorithms
1. Linear regression 2. Neural networks
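As a sketch of the first, ordinary least-squares linear regression in numpy (the data points are invented for illustration):

    import numpy as np

    # Hypothetical data: sensor reading x -> distance y
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([0.1, 1.9, 4.2, 5.8])

    # Design matrix with a bias column; solve for slope and intercept
    A = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    print(slope, intercept)  # roughly 1.94 and 0.09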
More Issues with ML
1. Supervised algorithms are only good at interpolating, not extrapolating. 2. Needs lots of data - millions of data points. 3. Learning often doesn't work and needs lots of tweaking.
ALVINN - What is it?
Autonomous Land Vehicle in a Neural Network - experiment in automating driving with a neural network
ALVINN Neural Network Arrangement
A greyscale camera image (30 x 32 pixels) is arranged into a vector and fed into a neural network: 960 input units, a 4-unit hidden layer, and 30 output units. The highest-valued of the 30 outputs gives the steering direction.
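A rough numpy sketch of that forward pass (the weights here are random placeholders, not ALVINN's trained values):

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(4, 960))  # 30x32 image flattened -> 4 hidden units
    W2 = rng.normal(size=(30, 4))   # 4 hidden units -> 30 steering outputs

    def steer(image_30x32):
        x = image_30x32.reshape(960)   # arrange the image into a vector
        h = np.tanh(W1 @ x)            # hidden-layer activations
        out = W2 @ h                   # 30 output units
        return np.argmax(out)          # highest output = steering direction

    print(steer(rng.random((30, 32))))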
ALVINN - First results
Did well in the parts of the state space covered by training, but performed poorly near the edge of the road (not represented in the training data).
ALVINN - Fixing the problem
Used geometry and domain knowledge to simulate views near the side of the road from existing images. Used computer graphics to generate these extra training examples, computed the desirable steering direction for each, and added them to the training set.
ALVINN - Reality Check
Why not just write the code directly, using the same geometric knowledge that generated the steering directions for the synthetic images?
Reinforcement Learning
Learn control based on experience - often used to get robots to do things we don't necessarily know how to program (e.g., walking).
Reinforcement Learning - Intuition
Behavioral feedback - easier to tell a robot "good" or "bad" than to tell it explicitly what to do.
Reinforcement Learning - Generalized Algorithm
Specify a task, give rewards for task-like behavior. Let the algorithm figure out the details.
Reinforcement Learning - Simple Case
Confine the world to a set of discrete states with discrete, deterministic actions; the world is fully observable and time advances in discrete steps. A toy case, but the restrictions can be relaxed to be more realistic.
Markov Decision Processes (MDPs)
The formalism underlying reinforcement learning. A set of states S, a set of actions A, transitions T: S x A -> S, and rewards R: S x A -> ℝ.
Markov Decision Processes Rewards
Rewards measure the immediate utility of each action (greedy).
Markov Decision Processes - Initial Problems
It is not enough to take the greedy action at each time step - something good now could hurt you in the long term, e.g., candy followed by a tiger.
Reinforcement Learning - in terms of MDPs
Learns a value function: the sum of rewards over the robot's lifetime. V: S x A -> ℝ gives the long-term value.
Q-Learning : What does it do?
It learns a state-action value function.
Q-Learning Function
Q(s,a) = R(s,a) + max Q(s',a') over all possible a'. The reward of the current state-action pair plus the value of the best possible next action.
Q-Learning - Dealing with Infinite Loops
Use gamma, the discount factor, interpretable as the probability of living to the next time step. It makes the math work out and avoids infinite sums: the reward of a loop = r/(1-gamma).
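For example, a loop that pays reward r at every step is worth r + gamma*r + gamma^2*r + ... = r/(1-gamma); with r = 1 and gamma = 0.9 the sum converges to 10 instead of diverging.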
Dynamic Programming Algorithm to Calculate Q-Values
Sample from the world and use the Q-values to decide actions. Store Q-value approximations in a table, with the values initialized randomly.
Dynamic Q-Learning Equation
Q(s,a) = (1 - alpha) * Q(s,a) + alpha * (r + gamma * max Q(s',a') over all possible a')
Dynamic Q-Learning Meaning of alpha
Alpha is essentially the learning rate. alpha = 0: never believe what you see. alpha = 1: always believe what you see.
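A minimal tabular version of this update on a made-up 5-state chain world (the dynamics, rewards, and hyperparameters are all invented for illustration):

    import numpy as np

    n_states, n_actions = 5, 2  # tiny chain world: actions are left (0) / right (1)
    Q = np.random.random((n_states, n_actions))  # start the table randomly
    alpha, gamma = 0.5, 0.9

    def step(s, a):
        # Hypothetical dynamics: right moves toward state 4, left toward 0;
        # reaching state 4 pays reward 1, everything else pays 0.
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        return s2, (1.0 if s2 == n_states - 1 else 0.0)

    s = 0
    for _ in range(1000):
        a = np.random.randint(n_actions)  # random exploration
        s2, r = step(s, a)
        Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (r + gamma * Q[s2].max())
        s = 0 if s2 == n_states - 1 else s2  # reset once the goal is reached

    print(Q.argmax(axis=1))  # greedy action per state; should come out "right"

Note the update line is exactly the equation above: old value and new evidence blended by alpha.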
Dynamic Q-Learning Cost/Benefits
Converging to the actual Q-function would take an infinite number of actions, but it works well in practice because what matters is the relative ordering of actions more than the specific values. Specify the task with rewards: low values for bad things, high values for good things.
Post-example Additional Issues with Q-Learning Dynamically
Works well on small problems where it is easy to specify the task and "optimal" behavior. Real problems may have many states and actions - hence, a lot of experience is needed. Random exploration is risky - some actions carry catastrophic risk.
Q-Learning Current Research
Continuous spaces, risk aversion, large state spaces, prior knowledge, addressing training time and data requirements.
Robot Teams - Why?
1. Can be in multiple locations. 2. Can divide & conquer tasks 3. Sensor cross-calibration 4. Fault tolerance 5. Simpler robots = less computation power 6. Faster (sometimes)
Robot Teams - Problems
1. Infrastructure: charging, robots break. 2. Deciding what to do is hard: 4 states per robot over 10 robots = 4^10 joint states. 3. Emergent behavior is hard to predict and code. 4. Too many robots can lead to sensor and physical interference. 5. Communication problems: some internet protocols can lock up robot systems - security and reliability are also issues.
Robot Teams - Current Research
1. Search and rescue. 2. Mine detection and detonation, including surf-zone mines. 3. Military vehicles: Future Combat Systems, to automate supply lines. 4. Intelligent cars: make all cars on the road a team to minimize travel time.
Robot Teams - Robocup Date
Started in 1994; 700 teams last year; goal of a human-vs-robot match by 2050.
Robot Teams - Robocup Leagues
1. Simulation league 2. Small-size robot league 3. Middle-size robot league 4. Legged robot league - Sony AIBOs 5. Humanoid league - Aldebaran Nao ($5000)
Robot Teams - Foraging Task
Robots gather pucks distributed in a task space. Two types of robots: 1. Search for pucks and put them in zone 2 (a larger area around the goal). 2. Move pucks from zone 2 to home.
Multi-robot Control: Paradigms
1. Centralized controller (master/slave) 2. Distributed control 3. Auction based task allocation
Multi-Robot Control - Centralized Controller definition
One unit is in charge and controls the others, perhaps in a hierarchy.
Multi-Robot Control - Centralized Problems
1. Single point of failure 2. Explicit coordination problem 3. Scales exponentially with the number of robots 4. Inherent problem with shared reference frames
Multi Robot Control - Distributed Control definition
Each robot decides on its own actions, possibly in collaboration with others
Multi Robot Control - Distributed Control Problems
Peer-to-peer communication can be unreliable. Dense communication topology. Emergent behavior is hard to predict.
Multi-robot Control - Auction-Based task Allocation - Basic Idea
Split up tasks into sub-tasks: 1. The auctioneer auctions out sub-tasks. 2. Robots bid on sub-tasks. 3. The lowest bid gets the task.
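A minimal sketch of such a one-round auction (all names and the cost model are illustrative, not a standard API):

    def allocate(tasks, robots, cost):
        # Each robot bids its own cost estimate for each task;
        # the lowest bid wins the task.
        assignment = {}
        for task in tasks:
            bids = [(cost(robot, task), robot) for robot in robots]
            assignment[task] = min(bids)[1]
        return assignment

    # Toy example: cost = 1-D distance from robot to task
    robots = {"r1": 0.0, "r2": 10.0}
    tasks = {"t1": 2.0, "t2": 9.0}
    cost = lambda r, t: abs(robots[r] - tasks[t])
    print(allocate(tasks, robots, cost))  # {'t1': 'r1', 't2': 'r2'}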
Multi-robot Control - Auction-Based task Allocation - 1 Round simple auction
A greedy scheme - it cannot guarantee optimality. More advanced versions use sub-auctions.
Multi-robot Control - Auction-Based task Allocation - Robot setup
Robots have lists of requirements and capabilities: sensing, actuation, computation. Match task requirements to robot capabilities - if a robot can fulfill the requirements, it can make a bid.
Multi-robot Control - Auction-Based task Allocation - Bids
Bids are broadcast and based on each individual robot's idea of how much time or power it would take to complete the task.
Bid Decisions
Based on: 1. Location 2. Abilities 3. Battery life left
Multi-robot Control - Auction-Based task Allocation - Problems
Works well in the naive case. Problems: 1. Collisions in physical space. 2. Combinatorial auctions might be better: bid on sets of auction items (but what's the best currency?). 3. What about targets of opportunity and re-bidding?
Multi - Agent Research
1. Robotics work tends to use simple auctions. 2. Experimenting with combinatorial, multi-level, multi-objective auctions. 3. It's where economic theory overlaps with robotics - and since robots are purely logical agents, the theory applies even better.
Human Robot Interaction - Big Questions
How do we get humans to interact naturally with robots - even humans who are not familiar with them?
Human Robot Interaction - Advances in Healthcare
1. Biggest area for robotics growth. 2. Robot walker for the blind - NavBelt. 3. Robotic walkers and wheelchairs. 4. Stroke therapy - a robotic arm moves patients through therapeutic motions. 5. Nursebot.
Human Robot Interaction - Humanoid Robots
1. MIT Cog 2. NASA's Robonaut 3. Nursebot
Human Robot Interaction (HRI) - Interaction Roles
1. Operator 2. Teammate 3. Mechanic/Programmer 4. Bystander
Human Robot Interaction - Operator Role
Human directly controls the robot via tele-operation or tele-presence. Questions: how do you keep awareness of what's around you with limited sensors? A camera can feel like blinders.
Human Robot Interaction - Teammate Role
Human collaborates with a robot on a shared task - proximal interaction. Questions: how do we make this seamless? The cues in human-to-human interaction are hard to quantify.
Human Robot Interaction - Mechanic/Programmer role
Single robot; the human is fixing problems with it. Questions: how much can the robot do itself? Can it tell the human what's wrong?
Human Robot Interaction - Bystander Role
How does the general public in the vicinity of the robot interact with it? Huge reliance on social context, which can change dynamically. To address this, we draw ideas from social psychology.
Human Robot Interaction - Bystander Role In depth
1. How do we get robots to interact with people who don't have technical knowledge of how a robot works? 2. People are uncomfortable when they can't predict social situations - social context scaffolds human interactions.
Bystander Role - Further Ideas
Theory of mind: model the interaction as states and transitions. Performing arts: what can actors tell us about human interaction? They are intimately involved with it.
Robot Vision - Cameras as Sensors
Arrays of pixels that detect photons - essentially a 2D array of buckets that count photons and can overflow, saturating vertical lines. Operate at 30 or 60 Hz; modern cameras are not interlaced.
Robot Vision - Cameras as Sensors, Advances
Today: $90 webcam with a USB interface, 2 megapixels. 10 years ago: $10,000 for 640x320 greyscale with custom hardware.
Robot Vision - Cameras as Sensors, Data format
Many multi-byte formats: RGB/HSV/etc...
Robot Vision - Vision is Hard!
How do we recognize objects, track objects, and derive semantic meaning (mapping pixels to an actual idea of an object)? Cameras give syntax - how do we get semantics?
Robot Vision - Cameras as Sensors, Problems
Too much data: some cameras have higher bandwidth than some computers can deal with - 2 billion bytes/sec! Also, cameras on robots shake.
Robot Vision - Stereo Vision
Use the different placement of objects in two stereo cameras to estimate distances. But how do you recognize that an object in one camera is the same object in the other (the correspondence problem)?
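Once that correspondence problem is solved, depth follows from the standard pinhole-stereo relation; a small sketch (the numbers are illustrative):

    def depth_from_disparity(focal_px, baseline_m, disparity_px):
        # Pinhole stereo: Z = f * B / d, where d is the horizontal shift
        # (in pixels) of the same feature between the two images.
        return focal_px * baseline_m / disparity_px

    # e.g., 500-pixel focal length, 10 cm baseline, 25-pixel disparity -> 2 m
    print(depth_from_disparity(500.0, 0.10, 25.0))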
Robot Vision - Optic Flow
Temporal image sequences: a motion vector for each pixel makes up a vector field. These "flow fields" can be used to avoid obstacles.
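A sketch of computing such a flow field with OpenCV's dense Farneback method (the frame paths are placeholders):

    import cv2

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    print(flow.shape)  # (height, width, 2): one (dx, dy) vector per pixel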
Robot Vision - Face Detection
Use the Viola-Jones algorithm: a set of Haar features is used to find the most face-like patch. A linear combination of Haar features with associated weights; the weights can be learned.
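A sketch of running a trained Viola-Jones detector via OpenCV's bundled Haar cascade (the image path is a placeholder):

    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    img = cv2.imread("snapshot.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        print("face at", x, y, "size", w, h)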
Robot Vision - Polly Algorithm
A simple, low-hardware solution to corridor navigation. Differentiates between walls and floor to align the robot along the center of the corridor.
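A rough sketch of the wall/floor idea (the brightness-based floor test and its thresholds are invented; the real Polly used its own low-resolution vision pipeline):

    import numpy as np

    def steer_from_floor(gray, floor_brightness=200, tol=30):
        # Mark pixels close to the known floor brightness as floor, then
        # steer toward the half of the image with more visible floor.
        floor = np.abs(gray.astype(int) - floor_brightness) < tol
        half = floor.shape[1] // 2
        left, right = floor[:, :half], floor[:, half:]
        return "left" if left.sum() > right.sum() else "right"

    # Toy frame: floor-bright on the right half
    frame = np.hstack([np.full((10, 10), 50), np.full((10, 10), 200)])
    print(steer_from_floor(frame))  # -> right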
