Android/Mobile Microconference Notes
Welcome to Linux Plumbers Conference 2015
The structure will be short introductions to an issue or topic followed by a discussion with the audience.
A limit of 3 slides per presentation is enforced to ensure focus and allocate enough time for discussions.
Please use this etherpad to take notes. Microconf leaders will be giving a TWO MINUTE summary of their microconference during the Friday afternoon closing session.
Please remember there is no video this year, so your notes are the only record of your microconference.
Miniconf leaders: Please remember to take note of the approximate number of attendees in your session(s).
SCHEDULE
8/20/15
Organizers:
Karim Y.
Rom L.
John S.
Topics for session 1:
Upstreaming updates
State of staging (Greg Kroah-Hartman)
USB gadget and ConfigFS - status & future (Andrzej Pietrasiewicz)
Barriers for running mainline kernels on consumer Android devices (John Stultz)
Single binary image for multiple devices
Android, partitions and customization (Rom Lemarchand)
Running a single Android binary image on multiple devices (Samuel Ortiz)
Adapting Android for Ara (Karim Yaghmour)
Integrating KDBus in Android (Pierre Langlois)
Topics for session 2:
Toybox in Android (Rob Landley)
Improving vendor AOSP repos (John Stultz)
Providing per-task Quality of Service (Juri Lelli)
Improving big.LITTLE on Android (Tim Murray)
Notes - session 1
Total # attendees: 60
Total # participants: 20
Introduction
- Schedule is slightly different from posted conf schedule
- Last year significant accomplishment: binder driver moved out of staging
- Key ppl here: Greg K-H, ARM, Intel ppl
State of staging (Greg Kroah-Hartman) 1:30
- Last year binder was merged out of staging following this talk
- Here's what we have in staging
- - Ashmem - will keep for now
- - MemFD - would like to move to this
- - One missing feature was unpinning (John Stultz)
- - Need to keep in staging for now
- - Greg hasn't heard any bug reports about it
- - Timed_GPIO & timed output
- - Last yr: No one using any more. True?
- - Vib HAL: Using timed output? Came in L. Before was directly calling timed_GPIO
- - Transient trigger LEDs can be used
- - Legacy drivers may be using & could remove after
- - Greg will yank
- - Low mem
- - Odd patches from Intel about this
- - People unhappy w/ it. Breaks on Intel 915?
- - Greg getting bug reports
- - Riley Andrews has some ideas but not had time to look into it
- - Greg to forward the intel emails for discussion
- - Who can maintain & track down?
- - John Stultz (Riley Andrews)
- - Sync
- - Greg was told ppl would use
- - Sync is mostly for graphics
- - Jesse B: Still have plans to do it, but need to have open source user space for it. Getting API from sync, it seems OK. Think Wayland may start using
- - Daniel: No one currently working on de-staging. Intel team can take it once they have a user space. Need for DRAM
- - ION
- - Is it getting used on new stuff?
- - Getting more users than would like
- - ION & buffer sharing need 2 drivers
- - What's replacement? There are proposals but no agreement. May discuss further at a BOF tmrw
- - Scary: Getting patches to add new features. Greg is stalling
- - Daniel: No upstream users that actually need. API is not there w/ needed features. Buffer sharing in ION: Keep stalling until get new drivers.
- - Would anyone be upset if Greg keeps stalling? Probably some Intel ppl that Daniel doesn't know.
- - Maybe need to highlight that needs more discussion
- - As long as it's there, ppl will keep using. If we take it away, then ppl will all go off & do their own thing & won't have visibility.
- - Suggest to wait until BOF discussion tomorrow
- - John S: probably gonna have to keep it in staging
- - There are a number of features not ever submitted upstream
- - Not talking about SOC-specific, talking core. e.g. netfilter
- - GadgetFS stuff is being worked on & pushed upstream (MTP)
- - cgroups
- - Rest is probably not easy to push in
- - Would like more reviewers on KDBus
- - Is SW sync included in Sync? Yes, subdriver. Should drop SW sync before merge. Would like user space test interface - keep only for testing.
- - Maybe expose in debugFS - should do before moving out of staging
- - Conclusion:
- - Will delete timed_GPIO with some fanfare
USB gadget and ConfigFS - status & future (Andrzej Pietrasiewicz) 2:00
- host & device can both be systems running linux
- today's focus is device
- USB functions are not same as in prog lang. They are groups of interfaces. Several functions implemented in kernel wh/ can be composed into gadgets.
- If you want gadget w/ some configurations & some functions, you must create a kernel module. So, have to do at compile time.
- Would like to be at runtime - this is where ConfigFS comes in
- Legacy gadgets compiled as modules will differ in some ways from those composed with ConfigFS
- LHS of slide 2 shows how legacy gadgets work
- - From user point of view, the workflow is just 1 step. Insert kernel module & stuff happens (see slide). Either we succeed or not.
- RHS of slide 2: configFS
- - There is no 1 moment in time when user does action. There are a number of actions that user must take. User must create each function directory explicitly.
- - Then, gadget needs to be enabled.
- - Some resources are allocated upon dir creation, others allocated at bind
- Table on slide 3 summarizes what happens when
- - what does he mean by "ln -s"?
- - functions can be used among different configurations b/c only 1 can be active at a time
- - make symbolic links to associate a function w/ a configuration
- Discussion (slide 4)
- - backward compatibility: Some legacy gadgets are special (e.g. ethernet for certain OS only handles USB configuration 0 [up to version 7?]). So USB 0 must somehow be shared & maybe we don't want this sharing.
- - Do these questions apply to Android use case, or in general?
- - When in your UI will you see the network interface? Think acquiring all resources needed for userspace should be done at directory creation time; currently it is not done at that time. Think user space will want this.
- - Badhri: Agree can do resource allocation during mkdir - would make more sense & require less time to switch between configurations. In Android, user gets to choose if device must work in particular configuration e.g. do you export pictures or storage. It's all based on user actions. Is there something in code that prevents us from doing it?
- - Andrzej: ConfigFS has nothing to do with that. It's a design decision
- - Badhri: When migrating from the Android composition driver to ConfigFS, what should we be putting in alloc instance vs. function & also bind? Are there guidelines?
- - A: boils down to what goes into each function. No guidelines
- - B: Why instance & function
- - A: Don't like the naming
- - ?: Currently it's all hardcoded that you're getting USB 0
- - A: A gadget bus has been proposed on the Linux USB mailing list. If the bus is created, a sysfs dir could store the info
- - ?: Need to expose info so can trace back i/f to function instance
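The ConfigFS workflow described in this talk (create the gadget, create each function directory explicitly, associate functions with a configuration via "ln -s", then enable the gadget) can be sketched roughly as follows. This is a minimal sketch, not the presenter's code: the helper name compose_gadget, the gadget name "g1", the function instance "ecm.usb0" and the UDC name are illustrative assumptions, and the demo runs against a scratch directory rather than a mounted configfs. The directory layout (functions/, configs/, UDC) follows the kernel's documented gadget ConfigFS interface.

```python
import os
import tempfile


def compose_gadget(root, name, functions, udc, config="c.1"):
    """Compose a USB gadget the ConfigFS way, one explicit step at a time.

    root is normally /sys/kernel/config/usb_gadget on a device; any
    writable directory works for a dry run (as assumed in this sketch).
    """
    gadget = os.path.join(root, name)
    cfg_dir = os.path.join(gadget, "configs", config)

    # Step 1: create the gadget directory and one configuration.
    os.makedirs(cfg_dir, exist_ok=True)

    for func in functions:
        # Step 2: every function instance is created explicitly (mkdir).
        func_dir = os.path.join(gadget, "functions", func)
        os.makedirs(func_dir, exist_ok=True)

        # Step 3: the "ln -s" step associates the function with the configuration.
        link = os.path.join(cfg_dir, func)
        if not os.path.islink(link):
            os.symlink(func_dir, link)

    # Step 4: writing a UDC name into the UDC attribute enables the gadget.
    with open(os.path.join(gadget, "UDC"), "w") as f:
        f.write(udc)
    return gadget


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()  # dry run, not a real configfs mount
    print(compose_gadget(scratch, "g1", ["ecm.usb0"], "dummy_udc.0"))
```

Unlike the legacy one-step "insert a module" flow, each step here can fail or be deferred separately, which is why the question of what gets allocated at mkdir time versus bind time comes up.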
Barriers for running mainline kernels on consumer Android devices (John Stultz) 2:25
- Why?
- - Moving to a new kernel won't bring benefit to user. But for the kernel & android devs it helps.
- - Experimentation: e.g. low memory
- - But it can be up to a year before the android devs pick up the kernel with the functionality
- - Many dev boards run mainline, but are missing desired hw
- - Collaboration improvement b/c can catch issues sooner & have active discussion, rather than 1 yr later when they have a shipping deadline & don't have time to discuss w/ community
- Blockers
- - HW
- - Unlockable bootloader great b/c gives
- - Most vendors will disable b/c optional
- - Headphone debug UART cable spec has been published & would like to see more people use this. Most devices have headphone jack.
- - Some vendors concerned b/c could put noise on audio path. Audience: Confirmed it's real
- - USB-C Maybe some lanes of superspeed could be used, would like to see some serial output standardized
- - Avoiding binary blobs
- - Firmware that's not redistributable
- - Android kernel patches
- - many out of tree
- - since 3.4 have been able to run android on top of mainline for things like qemu, pandaboard but has no sensors, no wifi, no bt
- - Good progress in git delta even with big functionality in 3.10 added
- - Sticky points:
- Where APIs to kernel are vastly different
- - Lagging upstream soc support
- - tons of code in vendor trees. The blue is same as from prev slide, which is dwarfed. Graph of 3.10
- - Need to develop DT files
- Nexus 7 status
- - Booting to MMC, to shell
- - configFS gadget working
- - touch panel input
- - gpio buttons working
- Bjorn & Rob Clark have gotten their displays up & running
- - Bjorn may have wifi (confirmed) & modem working
- TODO common across a lot of SOCs
- - upstream complex or not complete enough to be able to work with
- - ?: can run android bluetooth on top of upstream
- Q: binary blob - many Qualcomm Snapdragon chips. Hexagon is QDSP6. most binary blobs running on ... can objdump and reverse engineer
Android, partitions and customization (Rom Lemarchand) 2:40
- "war stories on android one"
- What does family mean?
- - Same SOC, maybe slight variations in HW (sensor, flash)
- wish: bootloader doing HW detection & populating DT
- Generic code: means for your particular platform (e.g. ARM, Intel)
- /odm introduced w/ Android One
- - Is it signed? Yes
- - Who signs it? "We"
- - Would allow over the air updates with different signatures? Yes. Would need to provide key
- - Does it contain kernel modules? Yes, it could.
- - Kernel modules signed? Compiled & signed & maintain in kernel tree, then odm's can pick which they want to use
- - Would odm's be able to sign their own kernel modules? not at moment, in this context
- /oem
- - customizations e.g. ringtone, backdrop
- Mark: testing new kernels on ... have to re-flash system partition & then disable verity messes things up. Would like to have kernel modules in an unsigned partition b/c they are already signed. Would like it for updating to new kernels
- Where do you put the certification for customizations for carrier - typically those are AOSP changes?
- - Typically negotiation w/ carrier. Not aware of what changes are being made for carrier.
- If we added ability to load modules through adb, would it help - insmod from host?
- - would be interesting, but not too helpful. Usually need to update whole kernel & test.
- Have you done build system changes to enable this?
Running a single Android binary image on multiple devices (Samuel Ortiz) 3:00
- Work that's been going on at Intel Open Source Technology Centre
- IRDA: Intel reference design for android
- - Similar concept to Android One, but for tablets on x86 silicon
- - John's wish but cheaper: $100 tablet with UART that's likely to run on upstream kernel
- - Samuel to tell John which ones
- - Not supposed to change AOSP
- - Fairly small HW differences: GPS, wifi, bluetooth
- - Initially running on KitKat, so no vendor partitioning
- Why so hard?
- - The way the build system is designed
- - BOM = bill of materials (HW definition of device). i.e. even changing 1 device means changing image
- How to make it more scalable?
- - proper way to load kernel modules & dependencies (no modprobe, insmod limited)
- - how to dynamically select different HAL? e.g. if you have 2 GPS on your devices, then will have 2 HAL
- - HW specific binaries e.g. wifi different for realtek, for broadcom
- - Android firmware depends on many properties that are set at build time
- - Android uses permission files that describe HW features on the device (e.g. ambient light sensor needs you to install)
- - HAL-specific configuration files need to be installed at build time
- - Want these to be more dynamic & not build time
- IRDA autodetection
- - all uefi and acpi tables will be right & entirely describe the HW & if wrong, will work it out with vendor
- - for every device acpi enumerating, would trigger a bunch of actions described by autodetection records
- - The only AOSP modification was libhardware
- - if want to use compass, use libhardware to give you sensor HAL
- - libhardware talks to autodetection daemon to figure out which HAL
- - many video graphics binary blobs are HW dependent (e.g. b/c codecs) - so, install into different directories, then bind mount the right folder
- - implemented a FUSE driver
- - etc permissions completely FUSEd
- - start with empty, then when boot & enumeration is done then will have exact set of permission files describing what firmware has told
- Examples of autodetection records
- - Bosch BMC150
- - Detected that it's magnetometer from Bosch & then know need to load iio-sensors-hal
- - then sets a bunch of properties that are completely HW-specific that the HAL needs
- - The fuse permission file is listed
- - wifi
- - Intel chipset on SDIO
- - Real enumeration, not ACPI
- - do a bind mount - see 3rd line: lnp to be bind mounted
- - Will have 3 different wpa_supplicant & if detect that it's intel, will use the intel-see to help them with this
- - Better ref HALs
- - hacking HAL dynamic selection b/c vendors feel like they need their own HAL b/c no good reference HAL out there
- - Have tried to work on it for bluetooth & upstreamed so that can base off standard AOSP one, so don't need another vendor-specific HAL
- - built-in kernel module support
- - If Android could dynamically load modules & their dependencies, that would be great
- - Resources...
- - Need to indicate at build-time that have ambient light support, but even if firmware says otherwise, it will think ALS is there
- - Should ask the HAL if ALS is actually there
- - Karim: this is not geared towards HW changing at runtime, just geared towards 1 image
- - Samuel: It could work for dynamic changing HW (e.g. remove something from USB bus). Need to remove the relevant XML file to let the framework know.
- - Karim: Some parts of framework don't like if something disappeared
- - Layered Android
- - where you have several partitions where you differentiate soc & vendor specific binaries that could be interchangeable
- Q: John: Which is the worst reference HAL?
- - A: The ones that are binary. Most GPS vendors will give you binary HALs. If Google provided a sophisticated GPS HAL, the vendor would use the AOSP HAL b/c their business is HW not making a HAL. Bluetooth is also a problem & it's several 1000s LOC. NFC requires ~60k LOC. Graphics. Can't point to any 1 AOSP HAL being widely used.
- - Some are really bad. Some not meant to be re-used.
- - Audio HAL is relatively good, though no one is shipping it directly. Many are using copy w/ the reconfigurability provided. Fewer tweaks than before - progress.
- - If knew how to write extensions to HAL would be helpful.
- - There are no wifi HAL, there's supplicant & then vendor libraries
- Q: Dimitry: Did you see what is time change with automatic discovery? What was impact on boot time?
- - A: Depends on SOC, but could get it to <500ms, which is good. Haven't seen any race condition.
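The autodetection records described above (a detected device triggers a set of actions: pick a HAL, set HW-specific properties, expose a permission file, bind mount the right binaries) could be modelled roughly like this. It is a sketch under stated assumptions, not the IRDA implementation: the record format, the enumeration IDs, the property key, the permission file name and the mount paths are all hypothetical. Only "iio-sensors-hal", the BMC150 and the "lnp" bind mount come from the notes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """One autodetection record: actions to take when a device is enumerated."""
    hal: Optional[str] = None                              # HAL that libhardware should hand out
    properties: dict = field(default_factory=dict)         # HW-specific properties for the HAL
    permission_files: list = field(default_factory=list)   # files exposed through FUSE
    bind_mounts: list = field(default_factory=list)        # (source, target) pairs


# Hypothetical records keyed by made-up enumeration IDs.
RECORDS = {
    "EXAMPLE:BMC150": Record(
        hal="iio-sensors-hal",
        properties={"ro.example.compass.axis_map": "0,1,2"},
        permission_files=["example.sensor.compass.xml"],
    ),
    "EXAMPLE:INTEL-SDIO-WIFI": Record(
        bind_mounts=[("/system/vendor/example/lnp", "/system/etc/wifi")],
    ),
}


def resolve(enumerated_ids, records=RECORDS):
    """Aggregate the actions for every enumerated device that has a record."""
    # Start empty: only hardware that was actually enumerated is advertised.
    plan = {"hals": [], "properties": {}, "permission_files": [], "bind_mounts": []}
    for dev_id in enumerated_ids:
        rec = records.get(dev_id)
        if rec is None:
            continue  # no record, so nothing is advertised for this device
        if rec.hal is not None:
            plan["hals"].append(rec.hal)
        plan["properties"].update(rec.properties)
        plan["permission_files"].extend(rec.permission_files)
        plan["bind_mounts"].extend(rec.bind_mounts)
    return plan


if __name__ == "__main__":
    print(resolve(["EXAMPLE:BMC150", "EXAMPLE:INTEL-SDIO-WIFI"]))
```

Starting from an empty plan mirrors the point made about the FUSE-backed permissions directory: the set of permission files is only known once enumeration is done.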
Adapting Android for Ara (Karim Yaghmour) 3:00
- Big deck, skipped many slides
- What's the philosophy & where is it trying to go?
- - Want to bring variety to HW ecosystem
- - Would like to show demonstration of scale to market, not just prototype & have module ecosystem that consumers can buy
- What module maker is expected to deliver p.19
- Some interesting technologies underlying Ara
- - Most important is UniPro
- - Think of the phone as network switch w/ modules interconnected w/ UniPro protocol, which is a MIPI standard
- - it's still a work in progress (some stuff has to be guessed in the spec)
- - Capacitive contacts (instead of metal contacts) to avoid wear & tear
- - Think of endoskeleton as a big UniPro switch. AP doesn't need to be involved in communication
- - Tried EPMs for locking modules in place, but won't be doing that (this is public info now). Not good if you dropped the phone.
- Back cover of every module can be printed on for customization
- Greybus is probably the most interesting thing on kernel side
- - what has been layered on top of UniPro
- - Closest you can get to a real Ara phone now is the gbsim (the URL is out-of-date)
- HW architecture
- - switch in middle is the endo skeleton
- - Currently don't have SOC with unipro, so have to go over USB
- - Supervisory controller is a uctrller in the endoskeleton with job to be master of bus to notify when new module connected
- Module
- - can have several interfaces, each w/ own CPort (think of these as sockets)
- SW arch
- - Initial design had kernel with greybus support inside & the HALs are dynamically loaded for each type of device
- - greybus
- - Took HALs from android & created greybus device classes with those
- - there are some things that speak other languages. Would like to bridge those over unipro.
- - Rom: PCI not on list?
- - Karim: Not a coincidence...
- - on kernel side
- - the greybus subsystem can be thought of as USB subsystem, but for unipro
- - no chips with native unipro yet, so using AP bridge controller hooked up to USB infrastructure to talk to USB subsystem on other side
- - Do you have to completely re-write driver if it already speaks IIC?
- - John: Idea is to have as little change as possible
- - What is smallest possible changes?
- - What about subsystems that aren't integratable?
- - Karim: Module will be detected by uctrller & tells the AP & AP can decide what to do
- - Becomes enumerable on greybus. Feeding hotplug events to ueventd
- - Ask Greg Qs about greybus
- - on Android user space side
- - The subsystems don't know how to deal with changes
- - 1st proposal was device manager & then wanted subsystems to talk to it. Jeff Brown NAK'd.
- - next proposal: let HAL layer deal with it
- - ended up with Endo Manager. Just let the subsystems think that the module is there & feed it what it would expect.
- - e.g. no camera there, camera app sees blank. Then plug in camera & see the feed. Android believes there's a camera that's not there
- - What if you bound physical HW address, then plug new in (e.g. bluetooth, wifi). Those have SW on/off. But doesn't cover stuff like bluetooth headset pairing if you remove the module.
- - What about sensors, where do they live? e.g. rotation sensors will always live in the display, which is not removable.
- - Won't this make for expensive displays?
- - Trying to change subsystems at once is not realistic, but could work if it's gradual adaptation. Have to work with each subsystem maintainer.
- - "Class" Manager
- - General idea is to provide fake devices until the real device is attached
- Recap of the different related talks:
- - Android one solves by different partitions
- - x86 solution is to feed off the firmware/bootloader & then Android chooses differently
- - Ara has to address challenge that things will change down the road
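The Endo Manager idea from the Ara talk (let the subsystem think the module is always there, and feed it what it expects until real hardware is attached) can be illustrated with a small stand-in object. This is only a sketch of the general pattern: the class names, the read_frame method and the blank frame value are hypothetical and are not taken from the Ara or Greybus code.

```python
class PlaceholderCamera:
    """Stand-in that keeps the framework believing a camera exists."""

    def __init__(self, blank_frame=b"\x00"):
        self._blank = blank_frame
        self._module = None  # the real module, once one is plugged in

    def attach(self, module):
        """Called on a hotplug 'add' event for a camera module."""
        self._module = module

    def detach(self):
        """Called on a hotplug 'remove' event; callers keep a valid object."""
        self._module = None

    def read_frame(self):
        # With no module present, the camera app simply sees a blank frame.
        if self._module is None:
            return self._blank
        return self._module.read_frame()


class DemoModule:
    """Hypothetical real module used only to exercise the placeholder."""

    def read_frame(self):
        return b"frame"


if __name__ == "__main__":
    cam = PlaceholderCamera()
    print(cam.read_frame())   # blank while nothing is attached
    cam.attach(DemoModule())
    print(cam.read_frame())   # real feed after hotplug
```

The pattern hides attach and detach from callers, which is also its limit: state tied to a specific module, such as the Bluetooth pairing case raised in the discussion, is not covered by a blank stand-in.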
Integrating KDBus in Android (Pierre Langlois) 3:55
- Rob: Did Linus shoot down KDBus on vanilla side?
- - ?: Linus wasn't convinced, but he trusts Greg
- - ?: He doesn't have to worry about it until next merge window
- Why work on it?
- - Seemed like a good time to work on it for Android
- Ideally you can take project in Android & replace libbinder with libkdbinder
- Binder API
- - Example adds 2 integers together
- - Doesn't matter where the integers are added
- - Have 2 implementations of the adder - remote & native
- - remote proxy example - what to do if have to add special service
- - shows what have to implement on top of KDBus
- Q: What are the binder limitations such that you need to use KDBus?
- - A: It's to share code in kernel
- Q: Performance impact?
- - A: Want to find that out
- Security discussion?
- - OK if Android. Don't do it on another system.
- - Think it's good if can use something that many have looked at
- Kernel driver always knows what has new threads
- Reference counting so can tell when services are dead
- KDBus overview
- - Mount it to get a virtual file system
- - Bus will keep track of connections
- - Each connection will have an ID
- - Everything is packaged into a message
- - Can do both synchronous & asynchronous transactions
- - Note: Timeout on asynchronous is just on client. Will deliver in order.
- - Synchronization guarantee: maintains order
- - we have all we need for synchronous transactions
- libkdbinder
- - Got rid of traditional service manager
- - Done in simplest manner could think of to start with
- - Q: What can you put in parcel in your implementation?
- - See future work section
- p.50 - was trying to get surface flinger working & started adding support to send a file descriptor. This is the next step.
- - John: Think should just work
- - ?: Should work, but can't map memory. Would need memfd.
- - Sealed memfd descriptors, so when you get it you know it's yours
- - How did you pass performance tests if you don't have asynchronous transactions?
- - Didn't
- - p.61 backup slide - believe only have done strong reference count
- - Discussed various things for future work (hard to hear...)
- - Expect ashmem should just work
- - There's nothing wrong w/ 1 connection per service
- - ?: Binder is insanely fast, so getting to binder efficiency will be hard
- - ?: Binder running into scalability issues
- Where is the code?
- - Would like to publish eventually & let Linaro pick up
- - John: Would like to see it soon b/c relevant to discussion about upstreaming KDBus
- What testing is being used? That one test has holes
- - Nothing substantial, but thinking getting surface flinger to work would be a good starting test
- - Can't break applications that have been around
- - Would like to see CTS test
- - Some companies will run 1000 most popular applications for you
- - There's no formal test to make sure all applications will work
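The adder example from the Binder API part of the talk (one interface, a native implementation and a remote proxy, so the caller does not care where the integers are added) can be sketched as below. This is an illustration of the proxy pattern only, under loud assumptions: it does not use the libbinder or libkdbinder APIs, the class names are hypothetical, and the "transport" is a plain in-process callable standing in for a synchronous Binder or KDBus transaction.

```python
import struct
from abc import ABC, abstractmethod


class IAdder(ABC):
    """Interface shared by the native and remote implementations."""

    @abstractmethod
    def add(self, a, b):
        ...


class NativeAdder(IAdder):
    """Service side: does the work and unpacks incoming messages."""

    def add(self, a, b):
        return a + b

    def on_transact(self, payload):
        # Unmarshal two 32-bit ints, call the real method, marshal the reply.
        a, b = struct.unpack("<ii", payload)
        return struct.pack("<i", self.add(a, b))


class RemoteAdder(IAdder):
    """Client side proxy: packages the call into a message and sends it."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def add(self, a, b):
        reply = self._transport(struct.pack("<ii", a, b))
        (result,) = struct.unpack("<i", reply)
        return result


if __name__ == "__main__":
    service = NativeAdder()
    proxy = RemoteAdder(service.on_transact)  # stand-in for a sync transaction
    print(proxy.add(2, 3))
```

Because callers only see IAdder, swapping the transport underneath (the libbinder to libkdbinder replacement discussed above) should not require changing them.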
Notes - session 2
Total # attendees: 38
Total # participants: 7
Toybox in Android (Rob Landley) 5:00
- See the URL for links to previous talks on Toybox
- - 2013 was the why toybox talk
- Super quick summary
- - Most of the audience not familiar with Toybox
- - In 1999 wanted to build linux from scratch & then Linux from scratch came out & use a floppy
- - In practice, busybox really replaced 4 commands in LFS 3.0 (theoretical was 22)
- - Often rewrote the busybox implementation from scratch for Aboriginal Linux 1.0
- - Check out the link to the "what's next" for Aboriginal Linux after 1.0
- - Want simplest system & to make implicit assumptions explicit in script
- - When GPLv3 happened, he had to audit the code base
- - Some of the code was GPLv2 only & couldn't be GPLv2+
- - Concluded GPLv2 only
- - Then handed off busybox maintainership & started on toybox
- - Didn't feel right to undermine busybox b/c chose the new maintainer & busybox had a 10-year head start
- - Looking at computer history low/high end of memory in systems & Moore's law
- - We only get new OS when we get new HW platform when something about the old HW platform makes it need to go away
- - Capacity constraints
- - Mainframe -> mini computer -> micro computer -> smart phone
- - This means smartphones will become the new workstation (with USB to give you the ergonomics) but the SW isn't there
- - Kick off is when system becomes self-hosting & don't need cross compiler
- - Don't want Android to be beaten to this by iPhone. Want a multi-vendor solution.
- - What has kept busybox & GNU tools out of Android? No GPL in user space (GPL v3 broke "the GPL")
- - busybox can't be re-licensed, but toybox can
- - Took the OpenBSD template license & removed 1/2 line that required the copying of license text (it can get out of hand e.g. Kindle Paperwhite 300+pgs of license boilerplate): public domain disguised as BSD
- - Could apply goal of simplest system rebuilding itself under itself to Android
- - Wanted Android to grow to become a development workstation
- - Expect smartphone price for performance to beat out PCs b/c of volume
- Karim: How far are we from busybox?
- - There is a Toybox status page that is auto-generated by comparing against roadmap page: http://landley.net/toybox/status.html
- - Roadmap for Toybox. http://landley.net/toybox/roadmap.html
- - Need an Android test environment (Elliott is testing vanilla un-modified Toybox & syncing about every 2 weeks)
- - less?
- - It's still in pending, but being tested by Elliott
- - Pending means >1hr of work to clean up
- - Need to implement POSIX-2008
- - Need Linux standard environment
- - Need development environment: some commands still missing (see roadmap page)
- - Need to replace Android toolbox
- - Tizen is interested in Toybox b/c of GPL cleanup: need smack support & Tizen test environment
- Karim: Do other people help in cleanup & code review?
- - Yes. Made cleanup page to help teach ppl how to do good cleanup. Accepts patches & tests
- - Android team is drinking out of fire hose with vendor patches, while needing to focus on next release
- Karim: Given that Toybox is adopted in Android, what has it not replaced in Android today?
- - List of commands not being used in Android
- - make clean
- - make defconfig
- - make
- - toybox -> will tell you all the commands implemented in toybox
- - look at toybox/android/Android.mk to see what they are enabling (by source file)
- - See the page http://landley.net/toybox/code.html
- - ps is about 2/3 complete & will add regex filter extension, among others
- Amir: what would start/stop do?
- - Rob:
- - Start service, stop service
- - 1.0 Toybox will be when it's relatively comprehensive
- - post-1.0 items will be more tricky work
- - repo is based on git, git is GPL & need to replace git with a non-GPL version
- - init is a hard problem
Improving vendor AOSP repos (John Stultz) 6:00
- No non-Nexus device support means it's harder to upstream
- Because there isn't the review culture, you don't get discussion & also get code festering in local repos
- - Rob: Gerrit reviews are corporate internal
- Samuel: Are you suggesting vendors should submit HALs to AOSP?
- - John: Not necessarily, but there isn't as good a sense of what's good/bad for AOSP. Also, not able to see commonality between vendors to come up with something unified
- - Karim: Also licensing is a consideration
- Laura: Have you found problems with tightly coupled kernel & user space?
- - John: pretty common, including ABI breaks
- Bero: Would it make sense to have some place to collect submissions even if they don't get into AOSP
- - John: Don't want to create another AOSP fork to solve the forks, but curious to see what people think
- - Karim: trashcan slide is also relevant problem to embedded
- Karim: boot-camp refers to the get-together organized by ?
- - Mark: They only let 2 representatives per company, so got to go for system day & security day. After it was done, there was a bunch of collateral. Is that publicly available?
- - Sumit: The members who were invited should have access, but not public
- Karim: Do you have a sense of interest by Google or vendors?
- - John: Hoping to find out in discussion
- - Laura: Unless it's mandated, vendors are likely to skip it for speed, unless it makes things easier for them
- - John: Don't think they are doing things badly intentionally, they are just trying to get things to go in absence of guidance
- John: What other pain points could docs help with?
- - Amir: security updates would force vendors to reduce integration friction
- - Samuel: Can you force vendors to do security updates?
- - John: Don't want to force. How can we make things easier & more collaborative to avoid the franken fork update pain
- John: Would be great if could get away from device directories & could move more towards something like kconfig. That would help get rid of duplicated lists of necessary packages if you could be more generic
- - ?: Where would it live?
- - John: Not sure. Would help if there was a proof of concept for it.
- - Karim: CyanogenMod has a tree that they continue maintaining forward
- - ?: There are a lot of ifdefs in CyanogenMod & there's no dependency tracking
- - Karim: Good thing about having drivers in Linux kernel is that someone has the old HW in their basement that they want to keep working.
- - Talk to Colin Cross
- Laura: Do you think vendors are aware that HAL layers might be a problem?
- - John: Don't have a sense of what the vendors think
- Dimitri: How does Linaro handle cases when they don't have the boards?
- - It is a problem. Not possible to validate everything
- - Sometimes it's delegation of the testing - e.g. maintainers don't have all the HW
- - Maybe could set up a testing tree that vendors could pull for testing. However, vendor trees collide very badly & it will be painful to resolve
- - ChromeOS has been doing this successfully - should talk to those guys
- Rom: Trouble with vendor trees, is that no one even builds the tree in the same way
- - Mark: If you build kernel within Android tree, you end up with coupling between kernel & user mode. Like having kernel built externally.
- - John: There's GCC4.8 issue
- - Karim: Can use 4.6
- - Greg: Had to patch out the bug in 4.8
- Rom: Would like to see more contributors to Android project so that there's more visibility
- - John: May be de-motivated if code is not going up. How is review load?
- - Rom: Some teams are better than others at bringing in patches
- - Amir: who to contact?
- - Greg: When you upload change to AOSP, Gerrit will auto-add the appropriate reviewer
- - Amir: What if no reviewer is added?
- - Oren: Have seen same issue
- - Greg: Need to investigate when get back to make sure the team reviewer lists are up-to-date
- - Rom: They all land in Rom's inbox
- John: Improving culture of review idea: Anyone can review in Gerrit. Maybe we can get a random group of internal reviewers to just review stuff & see what can learn. Might help to grow some external maintainers.
- - Rom: Also would like to see people's reviews to learn whose reviews to trust
Providing per-task Quality of Service (Juri Lelli) 6:40
- SCHED_DEADLINE - several yrs of work
- Last year's LPC discussion on scheduler
- - how can user space provide more info to scheduler?
- - how can scheduler make better use of that info?
- ~15% of room has heard of EAS (energy aware scheduling)
- - EAS is the full stack
- - Foundation patches are pretty stable & mostly ready to get merged
- - Per-Entity Load Tracking
- - dealing w/ big.LITTLE platforms & want to make use of signals
- - Riley: What's window?
- - Juri: 1 ms
- - Kapileshwar: y^32 = 0.5 to slowly decay
- - Energy model
- - how to schedule on which cpu, pack pins
- - contains info on p states & c states
- - SchedTUNE (released yesterday)
- - Extension to interface between user space & kernel space
- - Maybe for certain workloads, you want to run certain tasks boosted
- - can work at global level (sysfs entry) or with cgroup & can have hierarchy of cgroups & have different margins
- - John: From Android interactive CPU governor (cares about latency), where does latency come in - knobs?
- - Juri: If you care about the time for ramp up, can use this interface to boost the signal
- - Riley: Interactive just does it automatically
- - Juri: Trouble is that interactive does it a bit blindly. In this case you can pick which care about latency.
- Difference from Linux mainline scheduler
- - Doesn't know if you have big.LITTLE CPUs
- - Only accounts for performance & not doing tradeoffs
- Sched-DVFS
- - governor can be triggered from scheduler, so scheduler can decide which task on which CPU & which OPP
- - for ARM have kthread activated b/c need to sleep when change frequency
- - Only triggered by CFS scheduler, but tomorrow will discuss how to extend to consider Linux scheduler policies
- Discussion
- - Riley: Please describe the EAS tunables (knobs) b/c really hard to get them right
- - Juri: HMP had too many knobs, so didn't know if changing a knob would do the right thing. With SchedTUNE, trying to make it more usable by putting more intelligence inside the scheduler to help you get it right.
- - Bobby: Is sched-DVFS reacting fast enough?
- - Juri: While you're scheduling, you can ask it to be more reactive. Removes knob for rate of sampling b/c depends on type of workload running - scheduler figures it out
- - Riley: Must still be interactive for governor in place?
- - Juri: Actually, handled in scheduler & depends on how the signal is changing.
- - Riley: Actual placement of tasks - using geometrically decayed load to make judgement of where task goes. What heuristics are used? Have the nasty tunables from HMP disappeared?
- - Juri: There are heuristics & need to make guess & compute energy diff. Will trade off between energy efficiency & performance.
- - Bobby: Will tomorrow go into more detail? (Talk to Morten; will be discussed in EAS-1.)
- - Kapileshwar: Can also look at patches
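The decay mentioned under Per-Entity Load Tracking (y^32 = 0.5, with a window of roughly 1 ms per Juri) can be sketched as below. This is a minimal illustration, not the kernel code: the function names and the idea of feeding one number per window are assumptions made for the example.

```python
# Illustrative sketch only (not the kernel implementation).
# Y is the per-window decay factor, chosen so that Y**32 == 0.5:
# a contribution from 32 windows ago counts half as much as a fresh one.
Y = 0.5 ** (1 / 32)


def weight_after(windows):
    """Weight of a contribution that is `windows` windows old."""
    return Y ** windows


def decayed_sum(contributions):
    """Fold per-window contributions (oldest first) into one decayed sum.

    Each element is a hypothetical "how busy was the task in this window"
    value; every step ages the running total by Y before adding the
    newest window.
    """
    total = 0.0
    for c in contributions:
        total = total * Y + c
    return total
```

A task that was busy for one window and then idle for 32 windows ends up with half of its original contribution, which is the "slow decay" described in the discussion.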
- Deadline Scheduling
- - Currently helping to maintain Linux deadline scheduler
- - Would like to hear your suggestions for features
- - is a real-time scheduling policy
- - SCHED_FIFO: may starve other tasks. This avoids starvation
- - Enriches the scheduler by userspace communicating latency requirement of specific tasks
- - p.23 can be applied for different types of tasks (can be sporadic or periodic)
- - Can associate a deadline (real latency constraint)
- - p.25 maybe you have an audio pipeline, so you know the average execution time & then you'd write C code to specify these 3 parameters (runtime, period and deadline)
- - p.26 played movie with 6 other movies on same platform. CFS not always giving required quality.
- - Discussion - Where can we use this in Android?
- - High performance audio:
- - We can use this to specify that we only want to use 5% of CPU for 1 thing
- - Surface flinger
- - Can schedule multiple at same time?
- - John: How does sched deadline interact w/ cpufreq?
- - Juri: Doesn't interact. Need to integrate different policies. Will set a min freq to run at. Depends how you aggregate the different requests. You have to specify your highest OPP too.
- - John: Could you just set it to lowest & not deal?
- - Juri: Can use different capacity states (what's right OPP)
- - Riley: Deadlines relative to other deadlines?
- - Juri: Will do isolation
- - Riley: Higher priority than FIFO?
- - Juri: Will affect FIFO. Unlike FIFO, you can say you only want to give SurfaceFlinger 10% capacity max & have QoS guarantee
- - Riley: Was trying to look at it for SurfaceFlinger last year but __ in usec
- - Juri: __
- - Bobby: Why is usec significant?
- - Riley: It's kind of useless if can't assume at lowest performance level
- - Juri: You can specify to run at highest
- - Bobby: Will break schedRT?
- - Juri: You're not in control of OPP, so can break
- - Luca: EAS has no plan to mix scheduler with peripheral frequency or memory on SOC. Is there any future work?
- - Juri: Mike will discuss tomorrow about cpufreq integration
- - Luca: Will monitor memory bandwidth?
- - Juri: Suggest to ask tomorrow
Improving big.LITTLE on Android (Todd Kjos) 7:20
- This is more a set of observations when looking at big.LITTLE in their team
- Tim couldn't make it. Todd is on Android system team
- A53/A53 => 2 different frequencies of A53
- Gives scheduler notion of up-migration or down-migration
- Fairly complex tuning
- Graph #1 ("Good perf/W curve"): Looking at this, the SoC looks pretty good to tune for this performance workload.
- - You have to choose what model you're going to tune.
- - Little cores (blue line) can get better performance per watt if you up-migrate - get a pretty smooth transition & HMP can work pretty well.
- - This is not what most of them look like -> Graph #2 ("Not so good perf/W curve")
- - Are we willing to pay the enormous power cost to jump that gap?
- Problems
- - Typically we find it's tuned so that you don't jump to big core unless the A53s are completely saturated & you decide it's a huge task running for a while that way (e.g. some heavy rendering situation)
- Improvements in Android M
- - ActivityManager is the entity watching what's interacting w/ user right now
- - we already have notion of foreground & background, so have foreground cpusets & background cpusets. Want background tasks to be jailed on little or CPU 0 - don't run on big cores, save those for foreground
- - Configured in device-specific init.rc file
- Bobby: Confusion about in ARM: are async tasks always background?
- - Todd: Think it depends what they're interacting with
- - Riley: Not all are background
- - Bobby: There's an API called AsyncTask
- - Riley: Don't force into background
- - Rom: Matters if the task your AsyncTask is running with is foreground
- Mark: How much real work in background tasks - what system load?
- - Todd: Not much, don't have exact %.
- - Riley: Very intermittent. Some notion of wanting to separate foreground & background. Even if in background cpuset & de-niced, want to make sure they don't interfere w/ foreground tasks. Would like to be able to make some guarantees - future work.
- - John: Could sched round robin do something like that to elevate foreground tasks?
- - Riley: Have shied away from making things with non-deterministic runtimes round-robin. We do de-nice background tasks but they still interfere.
- John: how does it work w/ cgroups?
- - Todd: Helps cgroups. By limiting to CPU 0, you can __
- - Rom: Without cpusets, you'll see 4 CPUs idle & just start spreading tasks on the big ones
- - Bobby: EAS should handle that
- - Samuel: __
- - Todd: There is potential for scheduling interference
- Todd: These sorts of systems will start to be a bigger focus for us in the next year
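The foreground/background cpuset split described under "Improvements in Android M" can be sketched as a small planning helper. This is only an illustration: the 4+4 core numbering, the function names, and the choice to pin background work to a single little core are assumptions, and the real configuration is device-specific. The strings use the kernel's CPU list syntax (for example "0-3").

```python
def cpu_list(cpus):
    """Format CPU ids in CPU-list syntax, e.g. [0, 1, 2, 3] -> "0-3"."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def plan_cpusets(little, big):
    """Hypothetical policy: background jailed on one little core,
    foreground allowed everywhere (big cores kept for foreground)."""
    return {
        "background": cpu_list([min(little)]),
        "foreground": cpu_list(list(little) + list(big)),
    }


# Assumed topology for the example: little cores 0-3, big cores 4-7.
plan = plan_cpusets(little=[0, 1, 2, 3], big=[4, 5, 6, 7])
```

With this plan, idle big cores are never picked up by background work, which is the behaviour Rom describes as missing without cpusets.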